Code Review Like a Pro


Hello everyone!
Welcome to my first blog post! I’m excited to be here.


Today, I’m going to share with you my own research methodology for analyzing and reviewing application source code to identify vulnerabilities in whitebox engagements.

Whitebox assessments refer to a security testing scenario in which the attacker/tester has prior access to the internal workings of the application, including its architecture, schemas, or source code files. This access enables them to identify vulnerabilities more quickly and efficiently than with the blackbox approach. The whitebox approach is what this article covers.

This article focuses on source code files that we are able to obtain or find using Recon techniques (which will be explored in a future blog post).

Finding bugs and weaknesses in code you’re not familiar with, and getting into the developer’s mind, requires experience in reading code, and a lot of it. But I can promise you it gets much easier over time, as long as you practice and deal with code on a daily basis.

So let’s get started…

Approaches to Performing Code Review:

There are several ways to investigate code:

  • Covering code line by line
  • Focusing on low-hanging fruit functions such as Login, Registration, and Password Reset mechanisms
  • Grepping for keywords with regex for quick wins
  • Following user input using bottom-up and top-down approaches

Let’s break these down:

Covering line by line:

In cases where the project we are checking is considered large, this approach will consume a significant amount of time to cover end to end. Therefore, it is more suitable for projects with a small number of code files to cover.

Focus on Low-hanging fruit functions:

This approach prioritizes web application mechanisms where a weakness is likely to carry higher severity, because of their complex requirements such as validations, checks, and dependencies on other internal functions. We will therefore focus on functions such as Login, Register, Forgot Password, Upload Files, and more.

Grepping for keywords with regex for quick wins:

One of the quickest ways to find vulnerabilities is to search for a list of keywords and dangerous functions that could potentially lead to weaknesses. Examples include username, password, token, system, shell_exec, move_uploaded_file, file_get_contents, and other equivalent functions.

Note: Don’t forget to adapt your keyword vocabulary to the programming language you’re researching.
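
To make the idea concrete, here is a minimal Python sketch of a keyword grep. The keyword list is just a subset of the examples above, and the PHP snippet, `grep_keywords`, and the other names are made up for illustration. Treat it as a starting point, not a finished scanner:

```python
import re

# Hypothetical keyword list for PHP, based on the examples above.
DANGEROUS_KEYWORDS = [
    "system", "shell_exec", "move_uploaded_file", "file_get_contents",
    "password", "token",
]

# Match any keyword as a whole word, case-insensitively.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, DANGEROUS_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def grep_keywords(source: str):
    """Return (line_number, keyword, line) for every keyword hit in source."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(1).lower(), line.strip()))
    return hits

# Made-up PHP snippet, used only to demonstrate the search.
SAMPLE_PHP = """<?php
$name = $_GET['name'];
echo htmlspecialchars($name);
$out = shell_exec('ping ' . $_GET['host']);
move_uploaded_file($_FILES['f']['tmp_name'], '/uploads/' . $_FILES['f']['name']);
"""

if __name__ == "__main__":
    for number, keyword, line in grep_keywords(SAMPLE_PHP):
        print(f"{number}: [{keyword}] {line}")
```

Every hit is only a lead: each flagged line still has to be read by hand to decide whether user input can actually reach it.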

Following user input using bottom-up and top-down approaches:

This strategy is based on the fact that web applications rely on user input and are controlled by it in order to work properly. In PHP, for example, user input commonly arrives through variables like $_GET, $_POST, $_COOKIE, and $_REQUEST.

The bottom-up and top-down approaches are based on the concepts of sink and source. The former refers to any part of the program that may be influenced by external data, while the latter refers to any input or external data that enters a system. Going top-down means starting at a source and following the data forward until it reaches a sink. Going bottom-up means starting at a sink and tracing backward to see whether a source can reach it.
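
Here is a rough Python sketch of the top-down direction, under heavy assumptions: it only understands simple one-line PHP assignments, the source and sink lists are a small subset picked for illustration, and `trace_top_down` is a hypothetical helper of mine, not part of any tool. It marks variables assigned from a source as tainted and reports when a tainted value reaches a sink:

```python
import re

# Sources: PHP superglobals that carry user input.
SOURCES = re.compile(r"\$_(GET|POST|COOKIE|REQUEST)\b")
# Sinks: a small, illustrative subset of dangerous functions.
SINKS = re.compile(r"\b(system|shell_exec|file_get_contents|move_uploaded_file)\s*\(")
# Simple "$var = expression;" assignments only (not "==" comparisons).
ASSIGNMENT = re.compile(r"^\s*(\$\w+)\s*=(?!=)\s*(.+);\s*$")
VARIABLE = re.compile(r"\$\w+")

def trace_top_down(lines):
    """Follow user input forward and return (line_number, sink) findings."""
    tainted = set()
    findings = []
    for number, line in enumerate(lines, start=1):
        assignment = ASSIGNMENT.match(line)
        expression = assignment.group(2) if assignment else line
        # An expression is tainted if it reads a source or a tainted variable.
        is_tainted = bool(SOURCES.search(expression)) or any(
            name in tainted for name in VARIABLE.findall(expression)
        )
        sink = SINKS.search(expression)
        if sink and is_tainted:
            findings.append((number, sink.group(1)))
        if assignment:
            target = assignment.group(1)
            if is_tainted:
                tainted.add(target)
            else:
                tainted.discard(target)  # overwritten with clean data
    return findings

# Made-up PHP lines for the demonstration.
SAMPLE_LINES = [
    "$host = $_GET['host'];",
    "$cmd = 'ping -c 1 ' . $host;",
    "$banner = file_get_contents('/etc/motd');",
    "echo shell_exec($cmd);",
]

if __name__ == "__main__":
    print(trace_top_down(SAMPLE_LINES))
```

A bottom-up pass would do the opposite: start from each sink call and walk backward through the assignments until it either hits a source or runs out of code.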

The Power of Tools

A Security Researcher without his tools is like a baker without his mixer, gentlemen, so we need to discuss some tools that can save us a lot of time during our research and automate our efforts.

One of the main questions to keep in your head is: ‘How can I save time and make the process more efficient and more focused, achieving better results than the manual process?’

As discussed earlier, covering the code line by line is a very time-consuming process that we want to avoid. Therefore, we seek to apply the best-fit approach from those presented above to the specific code project we are dealing with.

Let’s explore some awesome tools:

  • cloc
  • graudit
  • TruffleHog
  • Driftwood

cloc:

A cool tool that I’ve discovered recently is cloc [1] (‘Count Lines of Code’), which quickly assesses the content of the files: comments, blank lines, and the actual code count. This tool can give us some first impressions of the material before we get our hands dirty with the code:

cloc_output

cloc categorizes the code content by its counts
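
To get a feel for the kind of counting involved, here is a tiny Python sketch. It is not how cloc itself is implemented: it only recognizes single-line comments with one prefix, while the real tool handles block comments and many languages. `count_lines` and the sample text are made up for illustration:

```python
def count_lines(source: str, comment_prefix: str = "#"):
    """Classify each line as blank, comment, or code (single-line comments only)."""
    counts = {"blank": 0, "comment": 0, "code": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(comment_prefix):
            counts["comment"] += 1
        else:
            counts["code"] += 1
    return counts

# Made-up sample file content.
SAMPLE = """# configuration loader

import json

def load(path):
    # read the whole file
    with open(path) as handle:
        return json.load(handle)
"""

if __name__ == "__main__":
    print(count_lines(SAMPLE))
```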

Note: The ‘cloc --show-lang’ command shows us the languages supported by this tool:

ABAP (abap)
ActionScript (as)
Ada (ada, adb, ads, pad)
ADSO/IDSM (adso)
Agda (agda, lagda)
AMPLE (ample, dofile, startup)
Ant (build.xml, build.xml)
ANTLR Grammar (g, g4)
Apex Class (cls)
Apex Trigger (trigger)
APL (apl, apla, aplc, aplf, apli, apln, aplo, dyalog, dyapp, mipage)
Arduino Sketch (ino, pde)
AsciiDoc (adoc, asciidoc)
ASP (asa, ashx, asp, axd)
..

graudit:

Another tool that can save us time during our research is graudit [2] (‘grep rough audit’). This signature-based tool utilizes a large set of signatures for popular programming languages. Behind the scenes, it runs predefined keywords through the grep utility to identify potential vulnerabilities in a code section:

graudit_output

graudit detected some potentially vulnerable code requiring further examination

The output of this tool might point at a function that depends on user input, an SQL statement executed without first passing through a sanitization check, or, for god’s sake, command execution functions that are controlled by the end user:

how_rce_are_born

TruffleHog:

TruffleHog [3] is a powerful open-source tool designed to detect secrets and leaked credentials that might have been accidentally or intentionally left behind in Git repositories or raw filesystems. It achieves this by scanning the entire commit history for high-entropy strings, which may indicate the presence of API keys, passwords, and other types of credentials.
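
To show what ‘high-entropy’ means in practice, here is a small Python sketch based on Shannon entropy. It is not TruffleHog’s actual implementation. The token pattern, the 4.0 threshold, and the sample key are all assumptions I picked for the demo:

```python
import math
import re
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(text).values()
    )

# Candidate tokens: long runs of characters commonly seen in keys (assumption).
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")

def find_high_entropy_strings(text: str, threshold: float = 4.0):
    """Return candidate tokens whose entropy meets the (arbitrary) threshold."""
    return [t for t in TOKEN.findall(text) if shannon_entropy(t) >= threshold]

# Made-up config: one random-looking value and one repetitive value.
SAMPLE_CONFIG = 'key = "q9Xv2LmZ7pRk4TgB8nWc1YhD"\npad = "aaaaaaaaaaaaaaaaaaaaaaaa"'

if __name__ == "__main__":
    print(find_high_entropy_strings(SAMPLE_CONFIG))
```

The repetitive value scores zero and is ignored, while the random-looking one is flagged. A threshold like this needs tuning, because long hashes and encoded blobs that are not secrets score high too.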

As discussed previously, our goal is to identify the quickest wins and expose the crown jewel assets, and this tool helps us achieve that.

In the picture below, taken while running the tool against a GitHub repo, we can see that it is able to find an exposed SSH private key, a PagerDuty API key, and even a MongoDB database connection string:

trufflehog_secrets_output

By the way, there is also a cool TruffleHog extension built specifically for Google Chrome [4] that monitors the websites we visit for API keys and credentials. Once the extension finds one, it immediately alerts us with a popup on the screen:

trufflehog_chrome_plugin

Secrets revealed without the need to dig into the source code

Driftwood:

Another tool from the same open-source creators is Driftwood [5], which picks up where TruffleHog’s output leaves off once a private key is found.

Many times, we come across hard-coded private SSH/TLS keys, and we’re not always certain which asset they belong to. Private keys have a small number of use cases and are typically used for the TLS and SSH protocols.

So, the purpose of this tool is to take a given private key, derive its public key component, and then look it up in a large database of known exposed public keys in order to determine which asset it belongs to:

driftwood_output_poc

Found 2 matches of TLS certificates to a given Private Key
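
The derive-then-lookup idea can be sketched in a few lines of Python. This is a toy model and not Driftwood’s code: it uses a textbook-sized RSA key instead of parsing real PEM files, a SHA-256 fingerprint format that I made up, and an in-memory dictionary in place of the real database. All names and asset labels are hypothetical:

```python
import hashlib

def derive_public_key(private_key):
    """For RSA, the public part is the modulus n = p * q and the exponent e."""
    return {"n": private_key["p"] * private_key["q"], "e": private_key["e"]}

def fingerprint(public_key):
    """Hash the public key into a stable lookup identifier (made-up format)."""
    material = f"{public_key['n']}:{public_key['e']}".encode()
    return hashlib.sha256(material).hexdigest()

def lookup(private_key, known_keys):
    """Return the assets recorded for the public key derived from private_key."""
    return known_keys.get(fingerprint(derive_public_key(private_key)), [])

# Toy private key with textbook-sized primes; real keys are vastly larger.
LEAKED_KEY = {"p": 61, "q": 53, "e": 17}

# Stand-in for the database of known public keys (hypothetical entries).
KNOWN_KEYS = {
    fingerprint({"n": 3233, "e": 17}): ["tls:example.test", "ssh:git@example.test"],
}

if __name__ == "__main__":
    print(lookup(LEAKED_KEY, KNOWN_KEYS))
```

The point is that only the public half ever needs to be compared, so the lookup can work against public data such as certificates.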

You can read more about it here to figure out how it really works behind the scenes.

Wrapping Up

The ideas and techniques presented in this article are just the tip of the iceberg on the way to becoming a code review master. I hope you learned a thing or two you didn’t know before.

For those who want to take this a step further, I highly recommend the OSWE [6] course by Offensive Security, which offers additional hands-on practice and covers the knowledge and techniques needed to achieve this goal.


Thank you for reading!


Disclaimer: This material is for informational purposes only, and should not be construed as legal advice or opinion. For actual legal advice, you should consult with professional legal services.