Zeek-based Network Traffic Data Analysis
This tutorial is based on the “Okay-boomer” exercise from https://www.malware-traffic-analysis.net/2019/11/12/index.html. If you haven't downloaded the exercise PCAP yet, see the Dataset preparation section, which contains all the essential information.
Dataset description
For more details, please check the original exercise page.
LAN segment data:
- LAN segment range: 10.11.11.0/24 (10.11.11.0 through 10.11.11.255)
- Domain: okay-boomer.info
- Domain controller: 10.11.11.11 - Okay-Boomer-DC
- LAN segment gateway: 10.11.11.1
- LAN segment broadcast address: 10.11.11.255
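The segment parameters above can be cross-checked with Python's standard ipaddress module. The following is a minimal sketch; the helper name is_local is introduced here only for illustration and is not part of the exercise or the tool.

```python
import ipaddress

# LAN segment taken from the exercise description
LAN = ipaddress.ip_network("10.11.11.0/24")

def is_local(address: str) -> bool:
    """Return True if the address belongs to the LAN segment."""
    return ipaddress.ip_address(address) in LAN

print(LAN.network_address)    # first address of the range
print(LAN.broadcast_address)  # broadcast address of the segment
print(is_local("10.11.11.11"))
```

A check like this is handy later in the analysis, when local hosts need to be told apart from external ones.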
The task is to review the PCAP and answer the following questions:
- What operating system and type of device is on 10.11.11.94?
- What operating system and type of device is on 10.11.11.121?
- Based on the MAC address for 10.11.11.145, who is the manufacturer or vendor?
- What operating system and type of device is on 10.11.11.179?
- What version of Windows is being used on the host at 10.11.11.195?
- What is the user account name used to log into the Windows host at 10.11.11.200?
- What operating system and type of device is on 10.11.11.217?
- What IP is the Windows host that downloaded a Windows executable file over HTTP?
- What is the URL that returned the Windows executable file?
- What is the SHA256 file hash for that Windows executable file?
- What is the detection rate for that SHA256 hash on VirusTotal?
- What public IP addresses did that Windows host attempt to connect over TCP after the executable file was downloaded?
- What is the host name and Windows user account name used on that IP address?
Initial overview analysis
Getting familiar with the capture and extracting its key characteristics is the basis of any network traffic analysis. We should know what types of connections are present, which local IP addresses appear, which services they provide, and more. The following examples show how to obtain some of this information.
Question 1: What operating system and type of device is on 10.11.11.94?
To get all available information about a given host, open the Search child window, select the query "Hosts info in given network range (CIDR)", and insert the host's address. The visualization then shows a Host node and two connected nodes with host data (Software and User_Agent) containing information about the detected software and user agent. When we click the User_Agent node, we can see the browser user agent (Mozilla/5.0 (X11; CrOS x86_64 12239.92.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.136 Safari/537.36), which can then be parsed, for example, with CyberChef. The CrOS token in the platform part of the string indicates Chrome OS.
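As an alternative to CyberChef, the platform part of a user agent can be inspected with a few lines of Python. This is only a rough sketch: the PLATFORM_HINTS table and the function name guess_platform are assumptions made for illustration, and the token list is far from exhaustive.

```python
import re

# Platform tokens and the OS / device type they usually indicate.
# Illustrative assumption, not an exhaustive or authoritative list.
PLATFORM_HINTS = [
    ("CrOS", "Chrome OS (Chromebook)"),
    ("Android", "Android (phone or tablet)"),
    ("Windows NT", "Windows (desktop or laptop)"),
    ("Macintosh", "macOS (desktop or laptop)"),
    ("iPhone", "iOS (iPhone)"),
]

def guess_platform(user_agent: str) -> str:
    """Guess OS and device type from the first parenthesized group."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    platform = match.group(1) if match else ""
    for token, label in PLATFORM_HINTS:
        if token in platform:
            return label
    return "unknown"

ua = ("Mozilla/5.0 (X11; CrOS x86_64 12239.92.1) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/76.0.3809.136 Safari/537.36")
print(guess_platform(ua))
```

User agents can be spoofed, so the result should be treated as a hint rather than proof.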
Question 2: What operating system and type of device is on 10.11.11.121?
We repeat the same procedure as for the previous question. The difference is that this time we see two User_Agent nodes and two Software nodes, so we need to determine which of them gives more accurate information about the host.
This can be easily found by selecting the nodes and fetching their neighbors via the context menu. For each User_Agent node, the result is a single node with application data from HTTP traffic. One node contains only a connectivity check, whereas the other contains information about a regular web page. The relevant user agent is the one with the regular page traffic. The string Mozilla/5.0 (Linux; Android 9; SAMSUNG SM-N950U) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/10.1 Chrome/71.0.3578.99 Mobile Safari/537.36 can again be analyzed, for example, with CyberChef.
Question 3: Based on the MAC address for 10.11.11.145, who is the manufacturer or vendor?
MAC addresses are not currently processed and therefore cannot be analyzed in the tool. However, this functionality can be added if necessary.
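Until then, the vendor can be resolved outside the tool from the first three octets of the MAC address (the OUI). The sketch below assumes that the MAC address has been read from the PCAP by other means and that an OUI table is supplied by the caller (for example, built from the public IEEE registry). The function names, the sample address, and the table entry are placeholders, not data from the capture.

```python
HEX_DIGITS = set("0123456789ABCDEF")

def oui_prefix(mac: str) -> str:
    """Return the first three octets (the OUI) of a MAC address, upper-cased."""
    digits = "".join(ch for ch in mac if ch not in ":-.").upper()
    if len(digits) != 12 or not set(digits) <= HEX_DIGITS:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))

def lookup_vendor(mac: str, oui_table: dict) -> str:
    """Look the OUI up in a caller-supplied table."""
    return oui_table.get(oui_prefix(mac), "unknown vendor")

# Placeholder values for illustration only (locally administered address)
table = {"02:00:00": "Example Vendor (placeholder)"}
print(lookup_vendor("02-00-00-aa-bb-cc", table))
```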
Question 4: What operating system and type of device is on 10.11.11.179?
In this case, the result is similar to Question 2. Again we see two User_Agent nodes. When we fetch the neighboring nodes, we see that one is related only to OCSP (Online Certificate Status Protocol), while the other is related to regular web browser traffic. The resulting string can again be analyzed, for example, with CyberChef.
Question 5: What version of Windows is being used on the host at 10.11.11.195?
In this case, it is sufficient to repeat the same approach as in Question 1 and analyze the result again in CyberChef.
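For Windows hosts, the user agent carries a "Windows NT x.y" token that maps to a marketing version. The mapping below reflects the well-known NT version numbers; the function name and the sample user agent are made up for illustration and are not taken from the capture.

```python
import re

# Well-known mapping of NT kernel versions to Windows releases
NT_VERSIONS = {
    "5.1": "Windows XP",
    "6.0": "Windows Vista",
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10 or later",
}

def windows_version(user_agent: str) -> str:
    """Translate the 'Windows NT x.y' token of a user agent into a release name."""
    match = re.search(r"Windows NT (\d+\.\d+)", user_agent)
    if not match:
        return "not a Windows user agent"
    return NT_VERSIONS.get(match.group(1), f"unknown (NT {match.group(1)})")

# Hypothetical user agent, used only to demonstrate the function
sample = "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"
print(windows_version(sample))
```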
Question 6: What is the user account name used to log into the Windows host at 10.11.11.200?
User information is typically part of domain-related communications within the local network. Therefore, we start the analysis using the query "Connections by hosts and/or time", where we specify the host's address as the Originator address and the local network range as the Responder address, and leave the time fields empty. Since we expect many connections, we choose Timeline clustering to avoid overloading the visualization.
The query returns two clusters of Connection nodes next to two Host nodes. To see the application data, we select both Connection clusters and retrieve the Application data via the context menu. Again, we group the results into a single cluster because we expect a larger amount of data.
In the Detail child window of the cluster containing Application data, we can see that it comprises Kerberos, Dns, and Ntp nodes. On the Data tab of the details, the kerberos.client column shows that the account name we are looking for is brandon.gilbert.
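When the kerberos.client column holds many values, user accounts can be separated from computer accounts programmatically. The sketch below assumes that the values follow the name/REALM form that Zeek typically logs and relies on the Windows convention that computer account names end with a dollar sign. The realm and the computer account in the sample data are assumptions, not values from the capture.

```python
def split_principal(client: str):
    """Split a 'name/REALM' principal into (name, realm); realm may be empty."""
    name, _, realm = client.partition("/")
    return name, realm

def is_machine_account(name: str) -> bool:
    """Windows computer accounts end with a dollar sign."""
    return name.endswith("$")

# Sample values: the realm and the computer account are assumptions
clients = [
    "brandon.gilbert/OKAY-BOOMER.INFO",
    "EXAMPLE-PC$/OKAY-BOOMER.INFO",
    "brandon.gilbert/OKAY-BOOMER.INFO",
]

names = {split_principal(client)[0] for client in clients}
users = sorted(name for name in names if not is_machine_account(name))
print(users)
```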
Question 7: What operating system and type of device is on 10.11.11.217?
In this case, it is sufficient to repeat the same approach as in Question 1 and analyze the result again in CyberChef.
Incident investigation analysis
Once we know all the essentials, it's time to look at the incident and find out what happened.
Question 8: What IP is the Windows host that downloaded a Windows executable file over HTTP?
The first step is to search for all nodes referencing an executable file. From the Detailed diagram, we know that the data contains File nodes with a mime_type attribute. For Windows executables, the MIME type is typically application/x-dosexec. To find such nodes, open the Search child window, select the query "Nodes by attribute and value", and insert Attribute file.mime_type and Value application/x-dosexec. The result is one node.
Select the node and fetch neighbors to find out who obtained and provided the file. The result is three added nodes: two Host nodes and one Files node. From the address range, we can determine who downloaded the file. Alternatively, we can enable the display of edge labels in the visualization settings. From the visualization, we can see that the file was downloaded by host 10.11.11.203.
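The same lookup can be approximated directly on Zeek output. The sketch below assumes a files.log written as JSON lines with the mime_type, tx_hosts, and rx_hosts fields; field names may differ between Zeek versions. Only the address 10.11.11.203 comes from this tutorial, the remaining sample values are placeholders.

```python
import json

# Illustrative records shaped like Zeek files.log JSON lines.
# Only 10.11.11.203 comes from the tutorial; other values are placeholders.
SAMPLE_LOG = """\
{"fuid": "F1", "tx_hosts": ["203.0.113.10"], "rx_hosts": ["10.11.11.203"], "mime_type": "application/x-dosexec"}
{"fuid": "F2", "tx_hosts": ["203.0.113.20"], "rx_hosts": ["10.11.11.94"], "mime_type": "text/html"}
"""

def executable_downloads(lines):
    """Yield file records whose MIME type marks a Windows executable."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("mime_type") == "application/x-dosexec":
            yield record

hits = list(executable_downloads(SAMPLE_LOG.splitlines()))
for hit in hits:
    # rx_hosts lists the hosts that received the file
    print(hit["fuid"], "received by", hit["rx_hosts"])
```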
Question 9: What is the URL that returned the Windows executable file?
To answer this question, select the Files node and fetch the neighbors via the context menu. The result is two nodes: Connection and Http. In the attributes of the Http node, we can see the URL used: acjabogados.com/40group.tiff.
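If the host and the request path are available as separate values (Zeek's http.log, for instance, stores them in the host and uri fields), the URL can be assembled as sketched below. The helper names are illustrative. The defang step is a common convention for sharing malicious URLs safely in reports, not a feature of the tool.

```python
def full_url(host: str, uri: str, scheme: str = "http") -> str:
    """Join host and request path into a complete URL."""
    if not uri.startswith("/"):
        uri = "/" + uri
    return f"{scheme}://{host}{uri}"

def defang(url: str) -> str:
    """Make a URL non-clickable for safe sharing in reports."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")

url = full_url("acjabogados.com", "/40group.tiff")
print(url)
print(defang(url))
```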
Question 10: What is the SHA256 file hash for that Windows executable file?
The SHA256 value is automatically calculated for all files and is available as a node attribute.
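To verify the attribute independently, for example against a copy of the file exported from the PCAP, the hash can be recomputed with Python's hashlib. This is a minimal sketch; the function name and the file path in the usage note are hypothetical.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A call such as sha256_of_file("extracted_file.bin") returns a 64-character hexadecimal string that can be compared with the node attribute.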
Question 11: What is the detection rate for that SHA256 hash on VirusTotal?
Granef is not currently connected to the VirusTotal service, so the check must be done manually (see results at VirusTotal). However, this functionality can be added if necessary.
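The manual check can also be scripted against the VirusTotal API v3. The sketch below assumes a valid API key and network access, and it reads the last_analysis_stats object of the file report. Which categories count toward the total is our assumption, and the numbers in the example are invented, not the real result for this file.

```python
import json
import urllib.request

VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{sha256}"

# Categories counted as a verdict; this selection is an assumption
VERDICT_KEYS = ("malicious", "suspicious", "undetected", "harmless")

def fetch_analysis_stats(sha256: str, api_key: str) -> dict:
    """Fetch last_analysis_stats for a hash (needs network access and an API key)."""
    request = urllib.request.Request(
        VT_FILE_URL.format(sha256=sha256), headers={"x-apikey": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        report = json.load(response)
    return report["data"]["attributes"]["last_analysis_stats"]

def detection_rate(stats: dict) -> str:
    """Format 'malicious / engines with a verdict' from the stats mapping."""
    total = sum(stats.get(key, 0) for key in VERDICT_KEYS)
    return f"{stats.get('malicious', 0)} / {total}"

# Invented numbers, only to show the output format
example_stats = {"malicious": 3, "suspicious": 0, "undetected": 7,
                 "harmless": 0, "type-unsupported": 5}
print(detection_rate(example_stats))
```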
Question 12: What public IP addresses did that Windows host attempt to connect over TCP after the executable file was downloaded?
First, we need to look at the time the file was downloaded. We can see this in the Connection node fetched as part of the solution to Question 9. To find further connections, use the Search child window, select the query "Connections by hosts and/or time", and insert 10.11.11.203 as the Originator address and 11.11.2019 22:22 (original connection timestamp: 2019-11-11T22:22:58.60402Z) in the From field. In the resulting graph, we then see that, in addition to hosts from the local network, the host connected to two external hosts: 5.188.108.58 and 138.201.6.195.
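The filtering logic behind this query can be sketched in Python as follows. The record layout (ts, orig_h, resp_h, proto) is an assumption modeled on Zeek connection records. The addresses and the download timestamp come from this tutorial, while the timestamps of the individual sample connections are made up.

```python
import ipaddress
from datetime import datetime, timezone

LAN = ipaddress.ip_network("10.11.11.0/24")

def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2019-11-11T22:22:58.60402Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

def to_from_field(ts: datetime) -> str:
    """Format a timestamp the way it is entered in the From field."""
    return ts.strftime("%d.%m.%Y %H:%M")

def external_tcp_peers(connections, host, since):
    """Return external responders contacted by host over TCP at or after since."""
    peers = set()
    for conn in connections:
        if (conn["orig_h"] == host and conn["proto"] == "tcp"
                and parse_ts(conn["ts"]) >= since
                and ipaddress.ip_address(conn["resp_h"]) not in LAN):
            peers.add(conn["resp_h"])
    return sorted(peers, key=ipaddress.ip_address)

download_time = parse_ts("2019-11-11T22:22:58.60402Z")

# Illustrative records: addresses from the tutorial, timestamps made up
connections = [
    {"ts": "2019-11-11T22:20:00.0Z", "orig_h": "10.11.11.203", "resp_h": "10.11.11.11", "proto": "tcp"},
    {"ts": "2019-11-11T22:23:10.0Z", "orig_h": "10.11.11.203", "resp_h": "10.11.11.11", "proto": "udp"},
    {"ts": "2019-11-11T22:23:30.0Z", "orig_h": "10.11.11.203", "resp_h": "138.201.6.195", "proto": "tcp"},
    {"ts": "2019-11-11T22:24:00.0Z", "orig_h": "10.11.11.203", "resp_h": "5.188.108.58", "proto": "tcp"},
]
print(to_from_field(download_time))
print(external_tcp_peers(connections, "10.11.11.203", download_time))
```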
Question 13: What is the host name and Windows user account name used on that IP address?
We will use the same approach as for Question 6 to answer this question. From the kerberos.client attribute, we can see the hostname tucker-win7-pc and user account candice.tucker.









