In this two-part series we are going to take Burp Suite Project files as input from the command line, parse them, and then feed them into a testing pipeline.

The series is broken down into two parts:

  1. Getting at the Data (i.e. from the CLI to feeding the pipeline)
  2. 8 Bug Hunting Examples with burpsuite-project-parser (i.e. from the pipeline to testing)

This post focuses on the bug hunting examples. Check out the previous post if you haven't already set up the environment.

Command Shortcut

In the previous post we used a long (repetitive) command to print the auditItems from a Burp Suite project file:

java -jar -Djava.awt.headless=true \
-Xmx2G \
--add-opens=java.desktop/javax.swing=ALL-UNNAMED \
--add-opens=java.base/java.lang=ALL-UNNAMED \
~/Downloads/burpsuite_pro_v2022.3.6.jar \
--user-config-file=ONLY_BURP_PROJECT_PARSER.json \
--project-file=2022-06-08.burp \
auditItems

For the sake of brevity, in this post we will replace the long command with an environment variable ($PARSE_BURP). You will need to make the paths specific to your environment:

export PARSE_BURP="java -jar -Djava.awt.headless=true -Xmx2G --add-opens=java.desktop/javax.swing=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED [INSERT_FULL_PATH]/burpsuite_pro_v2022.3.6.jar --user-config-file=[INSERT_FULL_PATH]/ONLY_BURP_PROJECT_PARSER.json --project-file=[INSERT_FULL_PATH]/[PROJECT_FILE].burp"

Then we can print all of the auditItems with:

$PARSE_BURP auditItems
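
If you prefer, a shell function works just as well. Here is a sketch of that alternative (same placeholder paths as above); it also avoids relying on the word splitting of an unquoted $PARSE_BURP:

# Sketch: the same command wrapped in a function; paths are placeholders for your environment.
parse_burp() {
  java -Djava.awt.headless=true -Xmx2G \
    --add-opens=java.desktop/javax.swing=ALL-UNNAMED \
    --add-opens=java.base/java.lang=ALL-UNNAMED \
    -jar "[INSERT_FULL_PATH]/burpsuite_pro_v2022.3.6.jar" \
    --user-config-file="[INSERT_FULL_PATH]/ONLY_BURP_PROJECT_PARSER.json" \
    --project-file="[INSERT_FULL_PATH]/[PROJECT_FILE].burp" \
    "$@"   # forwards the parser command, e.g. auditItems or proxyHistory
}

# Usage: parse_burp auditItems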

8 Bug Hunting Examples with burpsuite-project-parser

⛅ This list does not try to be comprehensive. Smarter people than me have done much better work mind mapping bug hunting techniques; if anything, these examples are incomplete. They are meant as starting points for taking input from a Burp Suite Project file and "looking for a bug or testing for a state" (i.e. the pipeline). If your feeling is "I could do this better", you are probably right ha. Take what works for you and leave the rest 😊.

1. Base Case

In the base case, burpsuite-project-parser proxyHistory will print the entire request (URL, headers, etc.) and response (headers, body, etc.) as JSON. For example:

$PARSE_BURP proxyHistory 2>/dev/null | grep -F "{" | head -n 2


{"Message":"Loaded project file parser; updated for burp 2022."}
{"request":{"url":"","headers":["Host:","User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:80.0) Gecko/20100101 Firefox/80.0","Accept: */*","Accept-Language: en-US,en;q\u003d0.5","Accept-Encoding: gzip, deflate","Cache-Control: no-cache","Pragma: no-cache","Connection: close"],"uri":"/success.txt","method":"GET","httpVersion":"HTTP/1.1","body":""},"response":{"url":"","headers":["Content-Type: text/plain","Content-Length: 8","Last-Modified: Mon, 15 May 2017 18:04:40 GMT","ETag: \"ae780585fe1444eb7d28906123\"","Accept-Ranges: bytes","Server: AmazonS3","X-Amz-Cf-Pop: ORD53-","X-Amz-Cf-Id: ADZK","Cache-Control: no-cache, no-store, must-revalidate","Date: Mon, 14 Sep 2020 17:59:54 GMT","Connection: close"],"code":"200","body":"success\n"}}

You will notice later on that we pipe this result to jq to get more specific with our query (for example, "give me only the URL from the JSON request": | jq -c '{"url":.request.url}'). Although we could grep all of the requests and responses, chances are we can be more surgical than that.
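
As a quick sketch of what "more surgical" can look like (the field names come from the JSON output above; selecting the Server header is my own example):

# Print the URL plus only the Server response header. Entries without a
# Server header (including the initial "Message" line) are silently dropped.
$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" \
| jq -c '{url: .request.url, server: (.response.headers[]? | select(startswith("Server:")))}'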

⚠️ I created an issue in burpsuite-project-parser to filter components from the proxyHistory and siteMap without jq. This should make the tool faster as well. You can follow the issue here and I will update the blog when this is done. ⚠️

2. Search for bug-class-specific GET parameters

Like many people, I have bug-class-specific GET parameters I search for (e.g. url= for SSRF). Let's say we wanted to search a Burp Suite project for any request with url= as a GET parameter:

$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url}' \
| cut -d\" -f4 | tr -d \" \
| grep -ie "\?url=" -ie "\&url=" 

Example Results:


Let's break this first example down a bit.

  1. $PARSE_BURP proxyHistory 2>/dev/null
    --> Parse our project file and output all of the request/response proxy history as JSON
  2. | grep -F "{" | jq -c '{"url":.request.url}'
    --> Take the JSON input and grab only the request URLs
  3. | cut -d\" -f4 | tr -d \"
    --> Give me the URL only and trim the quotes
  4. | grep -ie "\?url=" -ie "\&url="
    --> Grep for either (-e) "?url=" or "&url=" in a case-insensitive manner

This should give us a nice list of URLs that contained url= in their GET request.

You can replace the above grep command with any bug class you find interesting. Resources like SecLists are a good starting point for example dictionaries; there are a lot more out there, and I think most people curate their own.
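
As a sketch of scaling this up, you can feed a whole parameter dictionary to grep instead of hard-coding two patterns. Here params.txt is a hypothetical file with one parameter name per line (e.g. url, dest, redirect), perhaps pulled from SecLists:

# Turn each parameter name into a "[?&]name=" regex and grep for all of them at once.
$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url}' \
| cut -d\" -f4 | tr -d \" \
| grep -iE -f <(sed 's/^/[?\&]/; s/$/=/' params.txt)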

3. Create a script to request a page with input from proxy history

Let's say we wanted to take every URL from our project and perform a scan looking for a specific file (e.g. /.git/config) on that URL. Here is one way to create a script for this, using the Burp proxy history as input to our pipeline.

$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url}' \
| cut -d\" -f4 | tr -d \" \
| cut -d\? -f1 \
| xargs -I {} printf "curl {}/.git/config\n" \
| tee

You should end up with a set of commands in a shell script like:


Right away you can probably see one of the (many) problems with this: our "pipeline" is appending to the full URL rather than cutting off at the directory. In some cases this might be intended behavior, but chances are it is not.

I will leave it as an exercise for the reader to fix this (hint: rev + cut is one way; the solution is also in the next section).
What other problems could there be with doing it this way?
Are these the best settings for curl? Is curl the best tool for this job?
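
As a partial answer to the curl question, here is a sketch with tighter settings. The flag choices are my own, and urls.txt is a hypothetical file holding the URL list from the pipeline above:

# -s: no progress output; --max-time 10: bound each request;
# -o /dev/null: discard the body; -w: print "<status> <url>" per request.
while read -r base; do
  curl -s -o /dev/null --max-time 10 -w '%{http_code} %{url_effective}\n' "$base/.git/config"
done < urls.txt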

4. Feeding the ffuf monster

ffuf is incredible. Read/watch this brilliant 💎 by @codingo for an overview of ffuf.

The previous idea of searching for a specific file is better suited to a tool like ffuf. So let's go back to the same page search, but with ffuf instead. First, make sure to create a "bruteforce dictionary" with just the /.git/config path in it, for example:
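
# One-line wordlist; the /tmp/gitc path matches the -w flag below. The leading
# slash is dropped because the URL template below already ends in "/FUZZ".
printf '.git/config\n' > /tmp/gitc

Then: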

$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url}' \
| cut -d\" -f4 | tr -d \" \
| rev | cut -d\/ -f2- | rev \
| sort -u -S 2G \
| xargs -I {} printf "ffuf -t 40 -r -u \"{}/FUZZ\" -maxtime 60 -v -c -w /tmp/gitc \n" \
| tee

Example results:

ffuf -t 40 -r -u "http://target1/ajax/libs/jquery/1.11.0/FUZZ" -maxtime 60 -v -c -w /tmp/gitc
ffuf -t 40 -r -u "http://target2/FUZZ" -maxtime 60 -v -c -w /tmp/gitc
ffuf -t 40 -r -u "http://target2/images/FUZZ" -maxtime 60 -v -c -w /tmp/gitc

Breaking down the new parts:

  1. | rev | cut -d\/ -f2- | rev
    --> This is the solution to the previous question; grab the URL up to the directory
  2. | sort -u -S 2G
    --> Sort (with a 2G buffer) and give only the unique URLs
  3. | xargs -I {} printf "ffuf -t 40 -r -u \"{}/FUZZ\" -maxtime 60 -v -c -w /tmp/gitc \n"
    --> The ffuf command

What are my assumptions and potential issues with this new technique? How is this inefficient? Is every URL in-scope for your testing? Is the ffuf command correct?
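
As a sketch of one answer to the scope question, you can insert a host allow-list filter before generating commands (example.com below is a placeholder for your actual scope):

# Keep only URLs on example.com and its subdomains before building the ffuf commands.
$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url}' \
| cut -d\" -f4 | tr -d \" \
| grep -iE '^https?://([a-z0-9-]+\.)*example\.com(:[0-9]+)?(/|$)' \
| rev | cut -d\/ -f2- | rev \
| sort -u -S 2G \
| xargs -I {} printf "ffuf -t 40 -r -u \"{}/FUZZ\" -maxtime 60 -v -c -w /tmp/gitc \n"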

5. Find HTTP Response Headers with nginx

In this example we want to look through a Burp Suite project for any server response header that contains nginx (e.g. Server: nginx/1.12.2). This can be done with:

$PARSE_BURP responseHeader='.*nginx.*' 2>/dev/null \
| sort -u -S 2G

Example Results:

{"url":"https://target1:443/webfonts/fa-solid-900.woff2","header":"Server: nginx/1.12.2"}
{"url":"https://target2:443/","header":"Server: nginx/1.14.0 + Phusion Passenger 6.0.6"}

6. Search for an API key with regex - Take 1

In this example we want to search through a Burp Suite project for a known API key regex. For example, ([^A-Z0-9]|^)(AKIA|A3T|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{12,} will identify AWS API keys. Here is how we would do that against our project file:

$PARSE_BURP responseHeader='.*([^A-Z0-9]|^)(AKIA|A3T|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{12,}.*' 2>/dev/null 

7. Search for all the API key(s) with regex - Take 2

There are a couple of issues with Take 1 above. First, it has low yield because we are only using a single regex when we could be greedier about it. Second, it's memory intensive and doesn't scale well.

One solution is to use "the save results to MongoDB" feature (i.e. storeData=[MongoDB Host]) and then write a script to search the results. This scales very well and is reusable.
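
I won't cover that script here, but as a rough sketch of the idea (the database, collection, and field names below are placeholders; check what storeData actually writes in your instance):

# Hypothetical: search stored response bodies for an AWS key pattern and print the URLs.
mongosh --quiet localhost --eval '
  db.getSiblingDB("burp").responses
    .find({ body: { $regex: "AKIA[A-Z0-9]{16}" } }, { url: 1, _id: 0 })
    .forEach(doc => print(doc.url))
'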

Another solution, which is a little messier (and greedier), is to write all of the responses to files and then use an awesome tool like trufflehog to find all the secrets. That sounds like more fun, so let's go with that.

Step 1 is to write all of the HTTP responses from a Burp Suite project file to a directory.

mkdir burp_responses
$PARSE_BURP proxyHistory 2>/dev/null \
| grep -F "{" | jq -c '{"url":.request.url,"body":.response.body}' \
| while read -r line; do echo "$line" | tee burp_responses/$(uuidgen | tr -d '-').burp; done

Note, this will print the URL and the response body (only) to a set of files, one entry per file. If you want to search HTTP request headers, HTTP response headers, etc., then you need to adjust or remove the jq filter above.
On my system this command took around 10 minutes to run. A 384MB project file became 313MB worth of 105,309 files.
⚠️ This is the first time I have broken ls on my system with a "too many files" error in a directory 😂 ⚠️
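
If ls falls over, find copes with large directories just fine; for example, to count what was written:

# Globbing (e.g. ls burp_responses/*) can hit argument-list limits on huge
# directories; find streams the entries instead.
find burp_responses -type f -name '*.burp' | wc -l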

At this point we should have a directory (i.e. burp_responses) filled with thousands of files containing the URL and response body, one per file. Lastly, run trufflehog over the set of files and look for results.

trufflehog filesystem --directory=burp_responses --no-verification | tee trufflehog_results.txt

For speed and privacy reasons, I chose to set the --no-verification flag on my first pass. On secondary passes I would likely remove this flag.

8. Search for all the API key(s) with regex - Take 3

Because we already have the HTTP response bodies in files, let's use gf by the legend @tomnomnom to search for interesting things. If you are unfamiliar with gf, the core idea is that it's a reusable wrapper around grep.

gf comes pre-packaged with a set of great example checks.

Let's run the common s3-buckets gf pattern over our HTTP responses and see if we find anything of interest:

cd burp_responses
gf s3-buckets \
| sort -u -S 2G \
| tee -a gf_results.txt

Although it's not as powerful as using trufflehog, it is far superior to Take 1. Furthermore, gf makes it easy to write and reuse your own grep checks; consider this option when reviewing for interesting things at scale.
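
As a sketch of writing your own check (the pattern file format, a JSON file in ~/.gf holding grep flags plus a pattern, is from the gf README; the SSRF-ish pattern itself is my own example):

# Create a reusable pattern and run it from inside burp_responses.
mkdir -p ~/.gf
cat > ~/.gf/ssrf-params.json <<'EOF'
{
  "flags": "-HnriE",
  "pattern": "[?&](url|dest|redirect|uri)="
}
EOF
gf ssrf-params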

Concluding Thoughts

We have just skimmed the surface of the automation possibilities. I have a lot more ideas (and experience) related to AppSec automation, so stay tuned!