In this post, I'd like to visit the "Siberia" of Splunk data: frozen (archived) storage. For all other types of data besides frozen, you can get insight into your Splunk data at the index and bucket level by using the "dbinspect" command or apps like "Fire Brigade." However, because frozen data "lives" outside of the world of Splunk, there's no way to get insight into that data via Splunk. Therefore, I will outline a solution for creating a scripted input to send metrics to Splunk, which can then be used for reporting.

In our sample environment, frozen data is stored on each indexer under the "/data/frozen/" path. Inside of the "frozen" directory are directories for each index, which contain the frozen buckets. For example, archived data for the "windows" index would be in the "/data/frozen/windows/" directory and would contain many frozen buckets.

One of the metrics we wish to obtain is how much space the frozen data is taking up, per index and in total. Below is a bash script to collect this data. I made the comments fairly verbose to help illustrate what is going on.

#!/bin/bash

#Path on the indexers where the frozen data is stored
FROZEN_PATH="/data/frozen"

#Capture the current timestamp for use when outputting events
#(the exact format here is illustrative; any format Splunk can parse will do)
CURR_DATE=$(date "+%m/%d/%Y %H:%M:%S %z")

#Iterate through each index in the frozen path.
#The "_dir" variable is used to store all the paths
for _dir in "$FROZEN_PATH"/*/
do
    #Extract the portion of the path with the index name and store for later use
    CURR_IDX=$(echo $_dir | perl -pe 's/\/data\/frozen\/(.*)\//\1/')

    #Use the "du" command to get the size of the directory. Only the "total" line
    #is used, and it is transformed to include a field name "frozen_size_mb"
    FROZEN_SIZE_MB=$(du -cms "$_dir" | grep 'total' | perl -pe 's/(\d+)\stotal/frozen_size_mb=\1/')

    #Output the data into a Splunk-friendly event format, which includes a
    #timestamp at the beginning of the event, the index name, and the size field
    echo $CURR_DATE,index_name="$CURR_IDX","$FROZEN_SIZE_MB"
done

#Get the total size of the entire frozen path. This is optional since the
#events produced above can be aggregated to produce a total amount
FROZEN_TOTAL_SIZE=$(du -cms "$FROZEN_PATH"/ | grep 'total' | perl -pe 's/(\d+)\stotal/frozen_size_mb=\1/')

#Set "index_name" to "all" since this is for the entire path
echo $CURR_DATE,index_name=all,"$FROZEN_TOTAL_SIZE"

You can test the script by executing it at the command line.
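For example, with a placeholder script name of get_frozen_index_sizes.sh and made-up sizes, the output would look something like this:

$ ./get_frozen_index_sizes.sh
01/15/2015 10:00:00 -0500,index_name=windows,frozen_size_mb=2048
01/15/2015 10:00:00 -0500,index_name=linux,frozen_size_mb=512
01/15/2015 10:00:00 -0500,index_name=all,frozen_size_mb=2560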
This format will be easy to parse for Splunk and will minimize the amount of both index-time and search-time configuration necessary to use the data.

One thing to note is that the last "du" command, which gets the total size of the frozen path, could instead have been used to get the total size for all of the individual indexes. This would have created an event with a tabular format, which could then have been parsed in Splunk at search time into separate events. The script above is formulated to produce a single event per index, which alleviates any search-time manipulation of the data.

Also, the directory name for the index is being used as the "index_name." We know, however, that this directory could have any name, depending on how the index paths are configured in indexes.conf. In practice, most index configurations will just use the name of the index for the name of the directory on the filesystem. One example where this is not the case for a default index is the "main" index, which appears as "defaultdb" on the filesystem.

Now that we have our script created, we'll create an add-on to package it and the inputs configuration. We'll call the app "acme_TA_indexer_metrics." The script will be stored in the "bin" directory inside of the app, and the inputs.conf configuration will be stored in the "default" folder. The inputs.conf configuration would appear as follows:
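Here is a minimal sketch of the stanza, assuming the placeholder script name used earlier (it ships disabled so it can be turned on only where needed):

[script://./bin/get_frozen_index_sizes.sh]
index = os
sourcetype = indexer_storage_metrics
interval = 3600
disabled = 1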
The scripted input is configured to send data to the "os" index, set the sourcetype to "indexer_storage_metrics," and run every hour (the "interval" attribute is set to 3600 seconds, or one hour). When activating the input, you can create a copy of the stanza from inputs.conf in the "local" folder and enable it there. Not shown here, but index-time configuration to explicitly extract the timestamp and set line-breaking would be included in the add-on in the "default/props.conf" file.
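A minimal props.conf sketch, assuming the timestamp format used in the script above, might look like this:

[indexer_storage_metrics]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%Y %H:%M:%S %z
MAX_TIMESTAMP_LOOKAHEAD = 30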
In our scenario, this add-on would be configured with inputs enabled and deployed to the indexers, which have access to the frozen data. Its main purpose there is to execute the script and collect the data. If additional search-time configuration were added to this add-on, it would also be deployed to the search heads.
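From there, reporting is a matter of searching the collected events. As an illustrative example (not part of the add-on), the following search would trend frozen storage per index over time:

index=os sourcetype=indexer_storage_metrics index_name!=all
| timechart span=1d max(frozen_size_mb) by index_name

Filtering on index_name=all instead would trend the overall total.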