Splunk tstats example
8/15/2023

A common use of Splunk is to correlate different kinds of logs together. In fact, Palo Alto Networks Next-generation Firewall logs often need to be correlated together, such as joining traffic logs with threat logs. This page includes a few common examples which you can use as a starting point to build your own correlations.

Correlation techniques

Each of these techniques can be used to perform the same correlation; however, each has a different performance profile. They are listed here in order of increasing search complexity and decreasing time cost. In other words, the last technique is the most efficient for Splunk, but the hardest for a human to read.

Each example will correlate traffic logs and URL logs to determine how many bytes have been transferred to each FQDN in the time period. The URL log has a dest_name field with the FQDN, and the Traffic log has a bytes_out field, so we need to correlate them to know how many bytes went out to each FQDN. The goal is to visualize possible data exfiltration by showing the total bytes_out for each FQDN.

Correlation technique 1: Use a 'join' or 'transaction'

Join and Transaction commands are expensive, but conceptually familiar to most people. In a pinch they can be used to get a view of the data, but if you're making a dashboard on a larger dataset, they can be pretty expensive. In this example, we search for all URL logs (which contain the FQDN), then join them with traffic logs generated at the end of a session (which contain the total bytes in and out). All logs from a specific TCP session will have the same session_id, so it makes a decent correlation point. See Best correlation fields below for examples where session ID is not enough to correlate logs.

sourcetype="pan:threat" log_subtype="url" | join session_id [search sourcetype="pan:traffic" log_subtype="end"] | stats count, sum(bytes_in), sum(bytes_out) BY dest_name

Correlation technique 2: Stats correlation

This technique is less common, but is very useful. It's faster than a join because it reduces the number of searches required, but not much faster because it still pulls from the index. Here we perform the same search as the 'join' above and get the same results, but without using a 'join'. In this example, instead of joining two searches (one for URL logs and one for Traffic), we create a single search that pulls both, then use stats to filter and aggregate the data as we like.

(sourcetype="pan:threat" AND log_subtype="url") OR (sourcetype="pan:traffic" AND log_subtype="end") | stats values(sourcetype), values(dest_name) AS dest_name, count, sum(bytes_in) AS bytes_in, sum(bytes_out) AS bytes_out BY session_id | stats sum(count) AS count, sum(bytes_in), sum(bytes_out) BY dest_name

First we group by session_id to perform the correlation, then group by dest_name to build the final table of results we want. This is much faster than a join because we have only one round trip to pull data from the index, and we have eliminated the expensive join operation and replaced it with a stats groupby operation, which is faster.

Correlation technique 3: Datamodel (tstats)

This is by far the fastest correlation technique. In fact, it is the only technique we use in the Palo Alto Networks App for Splunk because of the sheer volume of data and just how much faster this technique is over the others. However, it also has the most complicated search syntax and is completely different from traditional SPL search language, so it takes some getting used to. This technique does not pull from the index, so there are a couple of things you need to configure before using it:

1. Install the Palo Alto Networks App for Splunk. It contains a full datamodel for all Palo Alto Networks logs, which is where we'll pull the logs from.
2. Turn on Datamodel Acceleration for all the Palo Alto Networks datamodels.

| tstats summariesonly=t count, values(log.bytes_in) AS log.bytes_in, values(log.bytes_out) AS log.bytes_out, values(log.dest_name) AS log.dest_name FROM datamodel="pan_firewall" WHERE nodename="log.traffic" OR nodename="log.url" GROUPBY log.session_id

In this example, we use a generating command called tstats. Note that generating search commands must be preceded with a 'pipe' | symbol, as in the example. The | tstats command pulls from the accelerated datamodel summary data instead of the raw data in the index, which is much faster than using the index. We use summariesonly=t here to force | tstats to pull from the summary data and not the index; by default it will pull from both, which can significantly slow down the search. Note the log. prefix, which is required when using tstats with Palo Alto Networks logs. URL data and Traffic data are pulled in one tstats command, so there is only one round trip to the summary data. Then we use the stats command to filter and aggregate similar to the previous techniques.

Best correlation fields

Each of the examples above uses the session_id field as the correlation point.
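The two-stage stats correlation (group by session_id to merge the two log types, then group by dest_name to build the report) is conceptually just a pair of groupby passes. Here is a minimal Python sketch of that idea using made-up log records; the field names mirror the searches in this post, but the data and the script itself are illustrative, not part of Splunk:

```python
from collections import defaultdict

# Hypothetical events standing in for URL (pan:threat) and Traffic
# (pan:traffic) logs; session_id is the shared correlation field.
logs = [
    {"sourcetype": "pan:threat", "session_id": 1, "dest_name": "example.com"},
    {"sourcetype": "pan:traffic", "session_id": 1, "bytes_in": 10, "bytes_out": 200},
    {"sourcetype": "pan:threat", "session_id": 2, "dest_name": "example.com"},
    {"sourcetype": "pan:traffic", "session_id": 2, "bytes_in": 5, "bytes_out": 50},
    {"sourcetype": "pan:threat", "session_id": 3, "dest_name": "other.net"},
    {"sourcetype": "pan:traffic", "session_id": 3, "bytes_in": 1, "bytes_out": 30},
]

# Stage 1 (like "... BY session_id"): merge all events from the same
# session into one record -- this is the correlation step.
sessions = defaultdict(dict)
for event in logs:
    sessions[event["session_id"]].update(
        {k: v for k, v in event.items() if k != "session_id"}
    )

# Stage 2 (like "... BY dest_name"): aggregate the correlated sessions
# per FQDN -- this is the reporting step.
totals = defaultdict(lambda: {"count": 0, "bytes_in": 0, "bytes_out": 0})
for session in sessions.values():
    row = totals[session["dest_name"]]
    row["count"] += 1
    row["bytes_in"] += session.get("bytes_in", 0)
    row["bytes_out"] += session.get("bytes_out", 0)

for dest, row in totals.items():
    print(dest, row)
```

A join-based approach would instead run two separate lookups and match URL records against traffic records one by one; the two-pass groupby touches each event exactly once, which is the intuition behind why the stats technique is cheaper than a join.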