Workflow¶
To use the gold-miner suite to attempt classification of unknown
traffic, the following steps should be done in turn:
Analyze a set of labeled training pcaps containing known, encrypted protocol traffic using gold-miner-trainer
Combining those individual training results into a single, aggregated training profile by using gold-miner-trainer-aggregator.
Use the gold-miner tools with the training profile to analyze unknown traffic in a PCAP or on an interface.
The steps below describe this process at a higher level, and the individual tools pages above provide greater detail about how each of the tools work.
Note that all of these steps can be executed in automated fashion by using the Test and Evaluation Suite, which takes a a YAML configuration file, creates a profile, analyzes the data for accuracy and produces an HTML report. (an example report is available to see what the output looks like). The Test and Evaluation Suite tool may be easier to start with instead of running each of these steps by hand.
Also see the Additional Tools document that describes
additional useful tools that are distributed with the gold-miner
package.
Steps to Classify Unknown Traffic Samples¶
The process for using the rapid classifier involves Generating individual training profiles that measures a sample of labeled traffic to build a profile of what each type of traffic looks like. The results of each measurement needs to be combined into a resulting single profile of all traffic. After these steps are completed, the resulting training profile can be used to analyze an unknown traffic stream to see how well it may match a known profile.
1. Generating individual training profiles¶
To generate individual training profiles based on each type of known,
labeled datasets use the gold-miner-trainer command. It can analyze
any number of pcaps and generate a starting statistical dataset to be
used in later steps. [Hint: Use an output filename that reflects the
type of data being analyzed.] Example usage:
gold-miner-trainer -T -o web_traffic.fsdb web_traffic.pcap
gold-miner-trainer -T -o mail_traffic.fsdb mail_traffic.pcap
In these examples, the web_traffic.pcap file is analyzed and a
web_traffic.fsdb training profile file is produced. A similar
example is shown for (e)mail traffic.
For further information see the gold-miner-trainer tool documentation.
2. Combine individual training profiles together¶
Once the multiple individual training sets are created, they must be
merged before giving them to gold-miner below. To merge them, use
gold-miner-trainer-aggregator with label/file pairs to create an
aggregated training-profile.fsdb:
gold-miner-trainer-aggregator -o training-profile.fsdb \
web web_traffic.fsdb \
mail mail_traffic.fsdb
Note that the arguments to the script include a repeated set of pairs of
a generic word as a label (e.g. web) and the individual training
profile for it (e.g. web_traffic.fsdb).
For further information see the gold-miner-trainer-aggregator tool documentation.
3. Analyzing an unknown traffic source¶
Now that we have trained parameters, we can analyze an existing pcap file or watch an interface for traffic of interest. The output will be a string of FSDB (tab separated) data representing confidence values. We assume we have a protocol of interest matching one of the profile names in the training file (“mail” in the example below).
gold-miner -r unknown.pcap -p training-profile.fsdb -g mail
This tools, by default, generates a tab-separated list of output data that can be easily parsed. A confidence value is given per traffic type being detected that can be compared against other types to determine what the traffic most likely might be.
For further information see the gold-miner tool documentation, which also goes into greater detail about the output, describes the other output format options, along with specifying other sub-algorithms to select between.