LockBit analysis

LockBit Encryption

Introduction

In this article, we briefly cover LockBit's features and then focus on the parts that other analysts have not fully covered, such as encryption. The LockBit sample used in this analysis was extracted during an incident response engagement. The attack was conducted manually by human operators, with the help of Cobalt Strike.

Finally, in the Annexes, we present a way to recover lightly encrypted stack strings using IDA and the miasm symbolic execution engine.

General description

The sample was packed, and its strings were "encrypted" with a XOR operation that uses the first byte as the key. The sample checks its process privileges and sidesteps Windows UAC using the bypass technique described in hfiref0x's UACME #43.
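To make the string obfuscation concrete, here is a minimal sketch. It assumes the layout is one key byte followed by the XOR-encoded bytes; the exact layout and the helper names `xor_decode` and `xor_encode` are illustrative assumptions, not code taken from the sample.

```python
def xor_decode(blob):
    """Decode a buffer whose first byte is the XOR key for the remaining bytes."""
    if not blob:
        return b""
    key = blob[0]
    return bytes(b ^ key for b in blob[1:])


def xor_encode(plain, key):
    """Inverse helper, only used here to build a test buffer (hypothetical)."""
    return bytes([key]) + bytes(b ^ key for b in plain)


# Round trip on a harmless test value
sample = xor_encode(b"example string", 0x5A)
print(xor_decode(sample))
```

Because XOR is its own inverse, the same loop serves for encoding and decoding; only the handling of the leading key byte differs.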

LockBit stops undesirable services by checking their names against a list (see Services in the IOC section): it opens the service control manager with OpenSCManagerA, enumerates dependent services with EnumDependentServicesA and stops them using ControlService with the SERVICE_CONTROL_STOP control code.

Processes are enumerated using CreateToolhelp32Snapshot, Process32First, Process32Next and OpenProcess. If a process name appears on the list (see Process in the IOC section), the process is killed using TerminateProcess.

The ransomware can enumerate shares on the current /24 subnet using EnumShare, and network resources using WNetAddConnection2W and NetShareEnum.
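For readers less familiar with the notation, "the current /24 subnet" means the 254 usable addresses that share the first three octets with the local machine. The sketch below only illustrates that address range with Python's standard library; how the sample obtains its local address is not covered here, and the function name is a hypothetical one.

```python
import ipaddress


def hosts_in_slash24(local_ip):
    """Return every usable host address in the /24 network containing local_ip,
    excluding local_ip itself."""
    # strict=False lets us pass a host address instead of the network address
    network = ipaddress.ip_network(local_ip + "/24", strict=False)
    return [str(host) for host in network.hosts() if str(host) != local_ip]


targets = hosts_in_slash24("192.168.1.57")
print(len(targets), targets[0], targets[-1])
```

This is useful on the defensive side as well, for example to scope which neighbouring hosts should be checked after an infected workstation is found.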

As usual, Windows snapshots are deleted using the following commands:
cmd.exe /c vssadmin delete shadows /all /quiet & wmic shadowcopy delete & bcdedit /set {default} bootstatuspolicy ignoreallfailures & bcdedit /set {default} recoveryenabled No & wbadmin delete catalog -quiet

Below, we will take a closer look at some of its features.

IOCP

To encrypt quickly and efficiently, LockBit uses the Windows I/O completion port (IOCP) mechanism, which provides an efficient threading model for processing multiple asynchronous I/O requests on a multiprocessor system. Instead of the usual CreateIoCompletionPort and GetQueuedCompletionStatus APIs, it uses the native NtCreateIoCompletion, NtSetInformationFile and NtRemoveIoCompletion.

LockBit starts by creating an I/O completion port using the NtCreateIoCompletion API:

create IOCP

Then, for each file that does not match an entry in the folder and file blacklists, it associates the file handle with the I/O completion port using NtSetInformationFile with the information class set to FileCompletionInformation:

associate file handle

Then, reading the file using the OVERLAPPED structure will create an I/O completion packet that is queued in first-in-first-out (FIFO) order to the associated I/O completion port:

iocp read file

Later on (in the encryption_thread), instead of calling GetQueuedCompletionStatus to dequeue an I/O completion packet from the I/O completion port, it calls NtRemoveIoCompletion:

NtRemoveIoCompletion
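The threading pattern itself can be modelled without any Windows API. The sketch below is only a conceptual analogy, using `queue.Queue` in place of the completion port and a placeholder transformation in place of the real work; none of the names come from the sample.

```python
import queue
import threading


def run_completion_model(jobs, worker_count=4):
    """Post packets to a FIFO queue and let a pool of workers dequeue them."""
    port = queue.Queue()          # stands in for the I/O completion port
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            packet = port.get()   # analogous to dequeuing a completion packet
            if packet is None:    # sentinel: no more work for this worker
                break
            name, data = packet
            with lock:
                results.append((name, data[::-1]))  # placeholder processing

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:              # analogous to a completed asynchronous read
        port.put(job)
    for _ in threads:
        port.put(None)
    for thread in threads:
        thread.join()
    return results


print(sorted(run_completion_model([("a", b"12"), ("b", b"34")], worker_count=2)))
```

The point of the design is that the producer never waits for a worker: it only queues packets, and whichever worker is free picks up the next one.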

Encryption

Introduction

The encryption is based on two algorithms: RSA and AES. First, an RSA session key pair is generated on the infected workstation. This key pair is encrypted using the attacker's embedded public key and saved in the registry under SOFTWARE\LockBit\full. A random AES key is generated for each file to encrypt. Once the file is encrypted, the AES key is encrypted using the RSA public session key and appended to the end of the file, together with the previously encrypted session key pair.

Encryption detection

To detect encryption algorithms, public crypto Yara rules can be used. On the unpacked binary, they produce the following results:

  • Big_Numbers1
  • Prime_Constants_long
  • Crypt32_CryptBinaryToString_API
  • RijnDael_AES_CHAR
  • RijnDael_AES_LONG

As you can see, the RSA algorithm has not been detected. While reading the code close to the AES functions, I searched for the constant -0x4480 with grep.app, which led me to the mbedtls library.
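When hunting for such a constant in a binary rather than in source code, remember that a negative 32-bit immediate is stored in two's complement and little-endian order. The following sketch shows the conversion and a naive byte search; the helper names and the sample bytes are assumptions made for illustration.

```python
import struct


def signed32_to_le_bytes(value):
    """Encode a signed 32-bit immediate the way it appears in x86 machine code."""
    return struct.pack("<i", value)


def find_constant(blob, value):
    """Return every offset where the 32-bit little-endian encoding of value occurs."""
    needle = signed32_to_le_bytes(value)
    offsets = []
    pos = blob.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(needle, pos + 1)
    return offsets


# -0x4480 is 0xFFFFBB80 as an unsigned 32-bit value
print(hex(-0x4480 & 0xFFFFFFFF))
print(signed32_to_le_bytes(-0x4480).hex())
```

A search for the raw bytes finds the constant even when the disassembler displays it as a large unsigned number.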

With the source code (or one close to the original), we can try to load the C header files in IDA via File -> Load file -> Parse C header file. You may have to set the Include directories and the Calling convention properly in the compiler options dialog box:

compiler dialog box

It may also be necessary to remove external includes such as stddef.h from the header files, to avoid errors like "Can't open include file stddef.h".

This allows the analyst to easily import structures:

struct drbg ctx

It may also provide hints, for example by comparing structure sizes (0x140):

drbg init

Encryption Preparation

Like many ransomware families, LockBit uses session keys to encrypt the symmetric key used to encrypt files. The function that generates the session keys is easy to find, because it sits just after the debug message "Generate session keys" and just before the encryption_thread :)

string generate session keys

Below is the function that generates the RSA session keys:

generate session keys

Because the private session key needs to be kept secret, LockBit RSA-encrypts it with its embedded public key (named Attacker_Modulus in the code):

import and encrypt session keys using embedded attacker's RSA public key

Finally, the full encrypted session keys are stored in the registry under SOFTWARE\LockBit\full, while SOFTWARE\LockBit\Public holds the public session key (modulus and exponent, 0x100 and 0x3 bytes respectively):

Stored session keys
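During an investigation it can be handy to turn the public value into integers. The sketch below assumes the value is the 0x100-byte modulus immediately followed by the 0x3-byte exponent, both big-endian; the ordering and byte order are assumptions to verify against your own sample, and the function name is hypothetical.

```python
MODULUS_SIZE = 0x100   # bytes, from the sizes given above
EXPONENT_SIZE = 0x3    # bytes, from the sizes given above


def split_public_session_key(blob, byteorder="big"):
    """Split a registry value into (modulus, exponent) integers.

    Assumes modulus first, exponent second; adjust if your sample differs.
    """
    if len(blob) != MODULUS_SIZE + EXPONENT_SIZE:
        raise ValueError("unexpected blob size: 0x%x" % len(blob))
    modulus = int.from_bytes(blob[:MODULUS_SIZE], byteorder)
    exponent = int.from_bytes(blob[MODULUS_SIZE:], byteorder)
    return modulus, exponent


# Synthetic test value: a 2048-bit number followed by 0x010001
fake = b"\x80" + b"\x00" * 0xFF + b"\x01\x00\x01"
n, e = split_public_session_key(fake)
print(n.bit_length(), hex(e))
```

A 0x100-byte modulus corresponds to a 2048-bit RSA key, which is a quick sanity check on whatever is read from the registry.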

Files Encryption

As mentioned in the IOCP chapter, each file marked for encryption is passed to the encryption thread using ReadFile and the OVERLAPPED structure. The authors extended the original structure with a random 128-bit AES key, generated just beforehand using BCryptGenRandom from the bcrypt.dll library:

generate aes key
BCryptGenRandom

In the encryption thread, the packet is dequeued from the I/O completion port, then the AES key is set using aesni_setkey_enc_128 if the processor supports the Advanced Encryption Standard New Instructions (AES-NI), or with the mbedtls_setkey_enc_128 function otherwise:

set aes key

The check for AES-NI support is performed earlier, before the RSA session keys are generated:

check aesni support

Finally, the file content is encrypted using AES and written back into the file:

file encryption

Files end data

At the end of each file, 0x610 bytes are appended. This data structure contains the required information for decryption:

Offset | Description                                       | Size (bytes)
0x0    | Encrypted AES key                                 | 0x100
0x100  | Encrypted RSA session keys                        | 0x500
0x600  | First 0x10 bytes of the attacker's RSA public key | 0x10

files end data
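The layout above translates directly into a small parser, which is convenient for triaging encrypted files. This is a minimal sketch based only on the offsets and sizes in the table; the `Trailer` type and the function name are hypothetical.

```python
from collections import namedtuple

TRAILER_SIZE = 0x610

Trailer = namedtuple(
    "Trailer",
    "encrypted_aes_key encrypted_session_keys attacker_key_prefix",
)


def parse_trailer(file_bytes):
    """Split the last 0x610 bytes of an encrypted file into its three fields."""
    if len(file_bytes) < TRAILER_SIZE:
        raise ValueError("file is too small to contain the 0x610-byte trailer")
    trailer = file_bytes[-TRAILER_SIZE:]
    return Trailer(
        encrypted_aes_key=trailer[0x000:0x100],       # offset 0x0, size 0x100
        encrypted_session_keys=trailer[0x100:0x600],  # offset 0x100, size 0x500
        attacker_key_prefix=trailer[0x600:0x610],     # offset 0x600, size 0x10
    )


# Synthetic file: 32 bytes of content followed by a fake trailer
data = b"X" * 32 + b"A" * 0x100 + b"B" * 0x500 + b"C" * 0x10
fields = parse_trailer(data)
print([len(field) for field in fields])
```

The last field can serve as a quick identifier: two files whose final 0x10 bytes match were presumably encrypted with the same attacker key.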

Decryptor and Decryption

To decrypt a file, the Decryptor needs to:

  1. Import the attacker's RSA key.
  2. Get the session key pair (placed at the end of the file) and decrypt it using the attacker's private key.
  3. Get the AES key used for encryption (placed at the end of the file) and decrypt it using the decrypted session key pair.
  4. Decrypt the file using AES with the recovered AES key.

The decryptor embeds the attacker's full RSA key (0x483 bytes) in its resources:

load attackers key

It imports the RSA key:

Import RSA attacker's key

It then reads the last 0x510 bytes of the file and decrypts the first 0x500 of them to get the RSA session keys:

decrypt session key

From there, it can decrypt the AES key (stored at the end of the file) with the RSA session key and, finally, the file itself.

Annexes

Dynamic Stack Strings

LockBit builds its strings dynamically on the stack. For example, the function that compares file names against a blacklist uses stack strings to build that blacklist:

stack strings

Because there are many strings, some as long as a ransom note, and because this is a classic technique that you may find in other malware, automating the recovery is useful. One solution would have been to write an IDA script that records write operations on the stack but, as mentioned in the general description, the strings are additionally decoded with a XOR routine:

XOR routine

or:

inline xor loop

In many malware families, a single function is used to decrypt the strings. Most of the time, the solution is to use IDA to get the cross-references to this function along with its parameters, and to execute the decryption routine through one of the following:

  • Python implementation
  • Debugger conditional breakpoint
  • IDA Appcall
  • x86 emulation (unicorn, miasm, etc.)

But because stack strings are often inlined, we cannot rely on cross-references.

One way to do it is to run static symbolic execution on a portion of the code delimited by the IDA selection. I chose this approach because, with dynamic emulation (symbolic or not), you may have to handle corner cases, such as an instruction reading or writing an undefined memory area, or an API call.

I used the miasm framework because it is easy to install (copy the miasm directory to C:\...\IDA\python\) and because its git repository already contains a good example.

Original miasm example

Taking a simple case:

simple stack string

By running the original miasm script, we get the following output:

IRDst = loc_key_2
@32[EBP_init + 0xFFFFFC2C] = 0x730077
@32[EBP_init + 0xFFFFFC20] = 0x770024
@32[EBP_init + 0xFFFFFC34] = 0x740062
@32[EBP_init + 0xFFFFFC28] = 0x6F0064
@32[EBP_init + 0xFFFFFC30] = 0x7E002E
@16[EBP_init + 0xFFFFFC38] = (EAX_init)[0:16]
@32[EBP_init + 0xFFFFFC24] = 0x6E0069

This gives us a good starting point to implement more features on it.
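To see why this output is already close to the goal, the concrete writes can be reassembled by hand. The sketch below copies the six 32-bit values from the output above, orders them by offset and decodes them as little-endian UTF-16; the 16-bit write of EAX_init is symbolic and is therefore left out. The function name is a hypothetical one.

```python
import struct

# Concrete 32-bit writes copied from the symbolic execution output
# (offset relative to EBP_init -> value written)
WRITES = {
    0xFFFFFC2C: 0x730077,
    0xFFFFFC20: 0x770024,
    0xFFFFFC34: 0x740062,
    0xFFFFFC28: 0x6F0064,
    0xFFFFFC30: 0x7E002E,
    0xFFFFFC24: 0x6E0069,
}


def rebuild_utf16_string(writes):
    """Concatenate contiguous 32-bit writes in address order and decode them."""
    raw = b"".join(struct.pack("<I", writes[offset]) for offset in sorted(writes))
    return raw.decode("utf-16-le")


print(rebuild_utf16_string(WRITES))  # prints: $windows.~bt
```

The manual approach only works here because every value is a constant and the writes are contiguous; the XOR-decoded strings need the evaluation steps described next.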

Modified version

The idea was to modify the script so that it automatically comments the instructions with the recovered strings and renames the local variables (see picture above). The steps are as follows:

1 - Symbolic execution
Run symbolic execution on each instruction and, for each memory write operation, store:

  • The destination memory expr (@32[EBP_init + 0xFFFFFC2C])
  • The instruction address (EIP)
  • The stack position (to handle cases where a local variable is accessed through ESP rather than EBP)

def run_symb_exec(symb, start, end):
    """ Execute symbolic execution and record memory writes

    :param symb         : SymbolicEngine
    :param start (int)  : start address
    :param end (int)    : end address
    return data (dict)  : Dictionary with the information needed (instruction offset, data_xrefs, destination memory expression, spd value)
    """
    data = dict()
    while True:
        irblock = ircfg.get_block(start)
        if irblock is None:
            break
        for assignblk in irblock:
            if assignblk.instr:
                LOGGER.debug("0x%x : %s" % (assignblk.instr.offset, assignblk.instr))
                if assignblk.instr.args:
                    if assignblk.instr.args[0].is_mem() is True:  # write mem operation
                        data[assignblk.instr.args[0]] = {'to': assignblk.instr.offset, 'd_ref_type': 2, 'expr': assignblk.instr.args[0], 'spd': idc.get_spd(assignblk.instr.offset)}
                if assignblk.instr.offset == end:
                    break
            symb.eval_updt_assignblk(assignblk)
        start = symb.eval_expr(symb.ir_arch.IRDst)
    return data

2 - Concretize destination memory
Replace the stack registers in the destination memory expression (@32[EBP_init + 0xFFFFFC2C]) with concrete values (0x0 for EBP, and the stack pointer delta (spd) reported by IDA plus 0x4 for ESP), so that the address simplifies to an integer.

def concretizes_stack_ptr(phrase_mem):
    """ replace ESP or EBP with a concrete value

    :param phrase_mem (dict)
    """
    stack = dict()
    for dst, src in phrase_mem.items():
        ptr_expr = dst.ptr
        spd = src['spd'] + 0x4
        real = expr_simp(ptr_expr.replace_expr({ExprId('ESP', 32): ExprInt(spd, 32), ExprId('EBP', 32): ExprInt(0x0, 32)}))
        if real.is_int():
            stack[real.arg] = src
    return stack
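Once EBP is replaced by 0x0, an address such as 0xFFFFFC2C is simply a negative offset stored as an unsigned 32-bit value. The sketch below shows that interpretation and how such an offset can be mapped to an index in a list modelling the local frame; it mirrors the idea behind the twos_complement and stack_to_frame_offset helpers of the full script, but the names and the bounds check are my own assumptions.

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def ebp_offset_to_frame_index(offset, frame_size):
    """Map an EBP-relative offset to an index in a list of frame_size bytes,
    where index 0 is the lowest address of the local variables area."""
    distance_below_ebp = -to_signed32(offset)
    if not 0 < distance_below_ebp <= frame_size:
        raise ValueError("offset 0x%x is outside the local frame" % offset)
    return frame_size - distance_below_ebp


print(hex(to_signed32(0xFFFFFC2C)))                    # -0x3d4
print(hex(ebp_offset_to_frame_index(0xFFFFFE00, 0x400)))  # 0x200
```

Working with positive list indexes keeps the later string scan simple: consecutive indexes correspond to consecutive addresses on the stack.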

3 - Evaluate the expression
Evaluate the destination memory expression to get the result expression. For example:

@128[ESP + 0x488] = {@8[0x423EE0] 0 8, @8[0x423EE0] ^ @128[0x423EE0][8:16] 8 16, @128[0x423EE0][16:128] 16 128}

4 - Translate the resulting expression in python
Using the miasm Translator, we can convert the expression into Python code:

{@8[0x423EE0] 0 8, @8[0x423EE0] ^ @128[0x423EE0][8:16] 8 16, @128[0x423EE0][16:128] 16 128} -> (((memory(0x423EE0, 0x1) & 0xff) << 0) | ((((memory(0x423EE0, 0x1) ^ ((memory(0x423EE0, 0x10) >> 8) & 0xff)) & 0xff) & 0xff) << 8) | ((((memory(0x423EE0, 0x10) >> 16) & 0xffffffffffffffffffffffffffff) & 0xffffffffffffffffffffffffffff) << 16))

This will allow us to evaluate the Python code to get the content.
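The translated expression only needs a memory(ea, size) function to be evaluated. The sketch below replaces the IDA-backed reader with a stand-in that reads from a hypothetical 16-byte buffer, so the XOR of the second byte with the first (key) byte can be observed outside IDA. The buffer content and the restricted eval namespace are assumptions made for illustration.

```python
FAKE_BASE = 0x423EE0
# Hypothetical data: key byte 0x10, then 'H' (0x48) XORed with the key, then padding
FAKE_MEMORY = bytes([0x10, 0x48 ^ 0x10]) + bytes(14)


def memory(ea, size):
    """Stand-in for the IDA reader: size bytes at ea as a little-endian integer."""
    start = ea - FAKE_BASE
    if start < 0 or start + size > len(FAKE_MEMORY):
        raise ValueError("read outside the fake buffer")
    return int.from_bytes(FAKE_MEMORY[start:start + size], "little")


# Expression text as produced by the Translator in the example above
TRANSLATED = (
    "(((memory(0x423EE0, 0x1) & 0xff) << 0) | "
    "((((memory(0x423EE0, 0x1) ^ ((memory(0x423EE0, 0x10) >> 8) & 0xff)) & 0xff) & 0xff) << 8) | "
    "((((memory(0x423EE0, 0x10) >> 16) & 0xffffffffffffffffffffffffffff) "
    "& 0xffffffffffffffffffffffffffff) << 16))"
)


def evaluate(py_expr):
    """Evaluate the translated expression with only memory() exposed."""
    return eval(py_expr, {"__builtins__": {}}, {"memory": memory})


value = evaluate(TRANSLATED)
decoded = value.to_bytes(16, "little")
print(hex(value), decoded[1:2])
```

Passing an explicit namespace to eval keeps the evaluated text limited to the one helper it needs, which is a little safer than evaluating it in the script's global scope.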

5 - Comment IDA and rename local var
I reproduce the stack frame as a list and, for each string found, add an IDA comment at the right instruction using the previously stored information, then redefine the local variable with the right size and name.

def set_data_to_ida(start, data):
    """ Add comment with the strings, and redefine stack variables in IDA
    """
    size_local_var = idc.get_func_attr(start, idc.FUNCATTR_FRSIZE)
    frame_high_offset = 0x100000000
    frame_id = idc.get_func_attr(start, idc.FUNCATTR_FRAME)
    l_xref = dict()

    if size_local_var == 0:  # TODO : handle when stack size is undefined
        size_local_var = 0x1000  # set stack size to arbitrary value 0x1000

    # represent the stack as a list (initialization)
    stack = [0 for i in range(0, size_local_var)]

    # fill the stack
    for offset, value in data.items():
        if (offset >> 31) == 1:  # EBP based frame
            if offset <= frame_high_offset:
                stack_offset = twos_complement(offset)  # Ex: 0xfffffe00 -> 0x200
                frame_offset = stack_to_frame_offset(stack_offset, size_local_var)  # frame_offset = member_offset = positive offset in the stack frame
                l_xref[frame_offset] = value
                stack[frame_offset] = value['value']
        else:  # ESP based frame
            l_xref[offset] = value
            stack_offset = frame_to_stack_offset(offset, size_local_var)
            frame_offset = offset
            stack[offset] = value['value']
        LOGGER.debug("offset 0x%x, stack_offset 0x%x, frame_offset 0x%x, value %s" % (offset, stack_offset, frame_offset, chr(value['value'])))

    index = 0
    while index < len(stack):
        if stack[index] != 0:  # skip null bytes
            buff = get_bytes_until_delimiter(stack[index::], [0x0, 0x0])  # TODO: adjust auto ascii/utf16 ?

            if buff:
                LOGGER.debug("raw output = %s" % repr(buff))
                ascii_buff = zeroes_out(buff)  # transform to ascii, deletes null bytes

                if buff[1] == 0x0:  # check if string is utf16 TODO: check if really useful
                    STRTYPE = idc.STRTYPE_C_16
                else:
                    STRTYPE = idc.STRTYPE_C

                var_offset = index
                var_name = idc.get_member_name(frame_id, var_offset)
                s_string = "".join(map(chr, ascii_buff))
                LOGGER.info("[+] Index 0x%x, var_offset 0x%x, frame_id = 0x%x, var_name %s, strings = %s, clean_string = %s, len_strings = 0x%x, size_local_var 0x%x, xref to 0x%x" % (index, var_offset, frame_id, var_name, s_string, replace_forbidden_char(s_string), len(buff), size_local_var, l_xref[index]['to']))

                # delete struct member, to create array of char
                LOGGER.info("[+] Delete structure member")
                for a_var_offset in range(var_offset, var_offset + len(buff)):
                    # if idc.get_member_name(frame_id, a_var_offset):
                    ret = idc.del_struc_member(frame_id, a_var_offset)
                    LOGGER.debug("ret del_struc_member = %d" % ret)

                # re add structure member with good size and proper name
                LOGGER.info("[+] Add structure member")
                ret = idc.add_struc_member(frame_id, replace_forbidden_char(s_string)[:0x10], var_offset, idc.FF_STRLIT, STRTYPE, len(buff))
                LOGGER.debug("ret add_struc_member = %d" % ret)

                # Comment instruction using the xref key
                LOGGER.info("[+] Comment instruction")
                if l_xref[index]['d_ref_type'] == 2:
                    if not idc.get_cmt(l_xref[index]['to'], 0):
                        idc.set_cmt(l_xref[index]['to'], s_string, 0)
                index = index + len(buff)
            else:
                index = index + 1
        else:
            index = index + 1
    LOGGER.info("[+] end")
    return stack
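The string scan at the heart of this function can be tried outside IDA. The sketch below is a standalone rewrite of the same idea (walk the byte list, cut at the first double null, drop the null bytes), kept independent of idc so it can be tested on its own; the function names and the sample stack are hypothetical.

```python
def bytes_until_delimiter(values, delimiter=(0x0, 0x0)):
    """Return the items up to and including the first delimiter, or None."""
    values = list(values)
    delimiter = list(delimiter)
    width = len(delimiter)
    for index in range(len(values) - width + 1):
        if values[index:index + width] == delimiter:
            return values[:index + width]
    return None


def extract_strings(stack):
    """Return (index, text) for every double-null-terminated run in stack."""
    found = []
    index = 0
    while index < len(stack):
        if stack[index] == 0:          # skip null bytes between strings
            index += 1
            continue
        run = bytes_until_delimiter(stack[index:])
        if run is None:                # no terminator: not a complete string
            index += 1
            continue
        text = "".join(chr(b) for b in run if b != 0)
        found.append((index, text))
        index += len(run)
    return found


# "ab" stored as UTF-16 at index 2, "cd" stored as ASCII at index 9
fake_stack = [0, 0, 0x61, 0, 0x62, 0, 0, 0, 0, 0x63, 0x64, 0, 0]
print(extract_strings(fake_stack))
```

Using a double null as the terminator lets one pass handle both UTF-16 and ASCII strings, at the cost of merging two ASCII strings that are separated by a single null byte.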

Script Output Example

By selecting instructions and running the script, the instructions are automatically commented and the local variables renamed:

xor routine output
xor loop inline

Yara

rule malware_first_unpacking_routine {
    strings :
        $h1 = { 81 [1-5] A9 0F 00 00 75 ?? C7 ?? ?? ?? ?? ?? 40 2E EB ED }
    condition :
        $h1
}
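When Yara is not at hand, the same hex pattern can be approximated with a byte regular expression. In this sketch the jump [1-5] becomes .{1,5} and each ?? wildcard becomes a single dot; it is an approximation of the rule above, not a replacement for it, and the sample bytes used to exercise it are invented.

```python
import re

# 81 [1-5] A9 0F 00 00 75 ?? C7 ?? ?? ?? ?? ?? 40 2E EB ED
UNPACK_STUB = re.compile(
    rb"\x81.{1,5}\xa9\x0f\x00\x00\x75.\xc7.{5}\x40\x2e\xeb\xed",
    re.DOTALL,  # let '.' match every byte value, including 0x0A
)


def find_unpacking_stub(blob):
    """Return the offsets of every non-overlapping match of the pattern in blob."""
    return [match.start() for match in UNPACK_STUB.finditer(blob)]


# Invented bytes that satisfy the pattern, preceded by one filler byte
sample = (
    b"\x90"
    + b"\x81\x7d\xfc"
    + b"\xa9\x0f\x00\x00"
    + b"\x75\x0a"
    + b"\xc7\x05\x11\x22\x33\x44"
    + b"\x40\x2e\xeb\xed"
)
print(find_unpacking_stub(sample))
```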

IOC

Commands

  • /c vssadmin Delete Shadows /All /Quiet
  • /c bcdedit /set {default} recoveryenabled No
  • /c bcdedit /set {default} bootstatuspolicy ignoreallfailures
  • /c wbadmin DELETE SYSTEMSTATEBACKUP
  • /c wbadmin DELETE SYSTEMSTATEBACKUP -deleteOldest
  • /c wmic SHADOWCOPY /nointeractive
  • /c wevtutil cl security
  • /c wevtutil cl system
  • /c wevtutil cl application
  • /c vssadmin delete shadows /all /quiet & wmic shadowcopy delete & bcdedit /set {default} bootstatuspolicy ignoreallfailures & bcdedit /set {default} recoveryenabled no & wbadmin delete catalog -quiet
  • /C ping 127.0.0.7 -n 3 > Nul & fsutil file setZeroData offset=0 length=524288 "%s" & Del /f /q "%s"

Mutex

  • Global\{BEF590BE-11A6-442A-A85B-656C1081E04C}

Services

  • wrapper
  • DefWatch
  • ccEvtMgr
  • ccSetMgr
  • SavRoam
  • Sqlservr
  • sqlagent
  • sqladhlp
  • Culserver
  • RTVscan
  • sqlbrowser
  • SQLADHLP
  • QBIDPService
  • Intuit.QuickBooks.FCS
  • QBCFMonitorService
  • sqlwriter
  • msmdsrv
  • tomcat6
  • zhudongfangyu
  • vmware-usbarbitator64
  • vmware-converter
  • dbsrv12
  • dbeng8
  • MSSQL$MICROSOFT##WID
  • MSSQL$VEEAMSQL2012
  • SQLAgent$VEEAMSQL2012
  • SQLBrowser
  • SQLWriter
  • FishbowlMySQL
  • MSSQL$MICROSOFT##WID
  • MySQL57
  • MSSQL$KAV_CS_ADMIN_KIT
  • MSSQLServerADHelper100
  • SQLAgent$KAV_CS_ADMIN_KIT
  • msftesql-Exchange
  • MSSQL$MICROSOFT##SSEE
  • MSSQL$SBSMONITORING
  • MSSQL$SHAREPOINT
  • MSSQLFDLauncher$SBSMONITORING
  • MSSQLFDLauncher$SHAREPOINT
  • SQLAgent$SBSMONITORING
  • SQLAgent$SHAREPOINT
  • QBFCService
  • QBVSS
  • YooBackup
  • YooIT
  • svc$
  • MSSQL
  • MSSQL$
  • memtas
  • mepocs
  • sophos
  • veeam
  • backup
  • bedbg
  • PDVFSService
  • BackupExecVSSProvider
  • BackupExecAgentAccelerator
  • BackupExecAgentBrowser
  • BackupExecDiveciMediaService
  • BackupExecJobEngine
  • BackupExecManagementService
  • BackupExecRPCService
  • MVArmor
  • MVarmor64
  • stc_raw_agent
  • VSNAPVSS
  • VeeamTransportSvc
  • VeeamDeploymentService
  • VeeamNFSSvc
  • AcronisAgent
  • ARSM
  • AcrSch2Svc
  • CASAD2DWebSvc
  • CAARCUpdateSvc
  • WSBExchange
  • MSExchange
  • MSExchange$

Process

  • wxServer
  • wxServerView
  • sqlmangr
  • RAgui
  • supervise
  • Culture
  • Defwatch
  • winword
  • QBW32
  • QBDBMgr
  • qbupdate
  • axlbridge
  • httpd
  • fdlauncher
  • MsDtSrvr
  • java
  • 360se
  • 360doctor
  • wdswfsafe
  • fdhost
  • GDscan
  • ZhuDongFangYu
  • QBDBMgrN
  • mysqld
  • AutodeskDesktopApp
  • acwebbrowser
  • Creative Cloud
  • Adobe Desktop Service
  • CoreSync
  • Adobe CEF Helper
  • node
  • AdobeIPCBroker
  • sync-taskbar
  • sync-worker
  • InputPersonalization
  • AdobeCollabSync
  • BrCtrlCntr
  • BrCcUxSys
  • SimplyConnectionManager
  • Simply.SystemTrayIcon
  • fbguard
  • fbserver
  • ONENOTEM
  • wsa_service
  • koaly-exp-engine-service
  • TeamViewer_Service
  • TeamViewer
  • tv_w32
  • tv_x64
  • TitanV
  • Ssms
  • notepad
  • RdrCEF
  • oracle
  • ocssd
  • dbsnmp
  • synctime
  • agntsvc
  • isqlplussvc
  • xfssvccon
  • mydesktopservice
  • ocautoupds
  • encsvc
  • firefox
  • tbirdconfig
  • mydesktopqos
  • ocomm
  • dbeng50
  • sqbcoreservice
  • excel
  • infopath
  • msaccess
  • mspub
  • onenote
  • outlook
  • powerpnt
  • steam
  • thebat
  • thunderbird
  • visio
  • wordpad
  • bedbh
  • vxmon
  • benetns
  • bengien
  • pvlsvr
  • beserver
  • raw_agent_svc
  • vsnapvss
  • CagService
  • DellSystemDetect
  • EnterpriseClient
  • VeeamDeploymentSvc

Registry

  • SOFTWARE\LockBit
  • SOFTWARE\LockBit\full
  • SOFTWARE\LockBit\Public

Persistence

  • HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\XO1XADpO01

UAC bypass CLSID

  • {3E5FC7F9-9A51-4367-9063-A120244FBEC7}
  • {D2E7041B-2927-42fb-8E9F-7CE93B6DC937}

Full Script

import idaapi
import idc
import logging
from math import log


def unpack_arbitrary(data, word_size=None):
    """ (modified pwntools) unpack arbitrary-sized integer
    :param data     : String to convert
    :param word_size : Word size of the converted integer or the string "all" (in bits).
    :return        : The unpacked number
    """
    number = 0
    data = reversed(data)
    data = bytearray(data)

    for c in data:
        number = (number << 8) + c

    number = number & ((1 << word_size) - 1)

    signbit = number & (1 << (word_size-1))
    return int(number - 2*signbit)


def memory(ea, size):
    """ Read memory from IDA
    :param ea (int)     : Start address
    :param size (int)   : size of buffer in normal 8-bit bytes
    : return int/long
    """
    return unpack_arbitrary(idc.get_bytes(ea, size), size * 8)


def twos_complement(esp_offset):
    return 0x100000000 - esp_offset


def stack_to_frame_offset(stack_offset, size_local_var):
    """
    stack_offset <-> frame_offset
    """
    return size_local_var - abs(stack_offset)


def frame_to_stack_offset(frame_offset, size_local_var):
    """
    stack_offset <-> frame_offset
    """
    return size_local_var - abs(frame_offset)


def get_bytes_until_delimiter(bytes_list, delimiter):
    """Get bytes until it reach the delimiter

    :param bytes_list (list)    : List of bytes
    :param delimiter (list)     : Delimiter to reach (e.g. [0x0, 0x0])
    :return a list of bytes
    """

    len_delimiter = len(delimiter)
    iterables = list()
    for x in range(len_delimiter):
        iterables.append(bytes_list[x:])
    abc = list(zip(*iterables))  # materialize so that 'in' and .index() also work on Python 3
    if tuple(delimiter) in abc:
        index = abc.index(tuple(delimiter))
        return (bytes_list[0:index]) + delimiter
    else:
        return None


def zeroes_out(bytes_list):
    """ delete null bytes

    :param bytes_list (list)    : list of bytes
    :return                     : list of bytes without zeros
    """
    return [x for x in bytes_list if x != 0]


def replace_forbidden_char(my_strings):
    """ replace IDA forbidden char
    """

    clean_string = list()
    for x in my_strings:
        if x.isalnum():
            clean_string.append(x)
        elif x.isspace():
            clean_string.append('_')
        else:
            clean_string.append('?')
    return "k_" + "".join(clean_string)


def bytes_needed(n):
    """ get the number of bytes that compose the number
    https://stackoverflow.com/questions/14329794/get-size-of-integer-in-python

    :param n (int) : number
    :return (int)  : number of bytes
    """
    if n == 0:
        return 1
    return (n.bit_length() + 7) // 8  # exact, unlike a float log() on large integers


def extract_byte_x(num, position):
    """ Get bytes from position

    :param num (int)        : original number
    :param position (int)   : desired byte position
    : return (int)          : byte x
    """
    offset = 0x8 * (position - 1)
    and_mask = (0xFF << offset)
    return (num & and_mask) >> offset


def symbolic_exec(start, end):
    """ Symbolic execution engine (modified of original https://github.com/cea-sec/miasm/blob/master/example/ida/symbol_exec.py)

    Takes start and end address from IDA selection, do symbolic execution on instructions
    Replace stacks registers (ESP, EBP) variable by concrete value 0x0 or current spd instructions
    """
    from miasm.ir.symbexec import SymbolicExecutionEngine
    from miasm.expression.expression import ExprId, ExprInt
    from miasm.expression.simplifications import expr_simp
    from miasm.core.bin_stream_ida import bin_stream_ida
    from miasm.ir.translators import Translator
    from miasm.analysis.machine import Machine

    def run_symb_exec(symb, start, end):
        """ Execute symbolic execution and record memory writes

        :param symb         : SymbolicEngine
        :param start (int)  : start address
        :param end (int)    : end address
        return data (dict)  : Dictionary with the information needed (instruction offset, data_xrefs, destination memory expression, spd value)
        """
        data = dict()
        while True:
            irblock = ircfg.get_block(start)
            if irblock is None:
                break
            for assignblk in irblock:
                if assignblk.instr:
                    LOGGER.debug("0x%x : %s" % (assignblk.instr.offset, assignblk.instr))
                    if assignblk.instr.args:
                        if assignblk.instr.args[0].is_mem() is True:  # write mem operation
                            data[assignblk.instr.args[0]] = {'to': assignblk.instr.offset, 'd_ref_type': 2, 'expr': assignblk.instr.args[0], 'spd': idc.get_spd(assignblk.instr.offset)}
                    if assignblk.instr.offset == end:
                        break
                symb.eval_updt_assignblk(assignblk)
            start = symb.eval_expr(symb.ir_arch.IRDst)
        return data

    def concretizes_stack_ptr(phrase_mem):
        """ replace ESP or EBP with a concrete value

        :param phrase_mem (dict)
        """
        stack = dict()
        for dst, src in phrase_mem.items():
            ptr_expr = dst.ptr
            spd = src['spd'] + 0x4
            real = expr_simp(ptr_expr.replace_expr({ExprId('ESP', 32): ExprInt(spd, 32), ExprId('EBP', 32): ExprInt(0x0, 32)}))
            if real.is_int():
                stack[real.arg] = src
        return stack

    def eval_src_expr(symb, data):
        """ evaluate/resolve the source expression trying to get a concrete value.
        Translate to python and eval if necessary

        Evaluation:
        @128[ESP + 0x488] = {@8[0x423EE0] 0 8, @8[0x423EE0] ^ @128[0x423EE0][8:16] 8 16, @128[0x423EE0][16:128] 16 128}
        Translation:
        {@8[0x423EE0] 0 8, @8[0x423EE0] ^ @128[0x423EE0][8:16] 8 16, @128[0x423EE0][16:128] 16 128} = (((memory(0x423EE0, 0x1) & 0xff) << 0) | ((((memory(0x423EE0, 0x1) ^ ((memory(0x423EE0, 0x10) >> 8) & 0xff)) & 0xff) & 0xff) << 8) | ((((memory(0x423EE0, 0x10) >> 16) & 0xffffffffffffffffffffffffffff) & 0xffffffffffffffffffffffffffff) << 16))
        """
        tmp = dict()
        for dst, src in data.items():
            tmp_val = symb.eval_expr(src['expr'])
            if tmp_val.is_int():  # Case where the source expression is a int
                tmp_val = int(tmp_val.arg)
            else:  # The source expression may by a data reference (read from mem)
                # Translate expression to python
                py_expr = Translator.to_language('Python').from_expr(tmp_val)
                try:
                    tmp_val = eval(py_expr)  # TODO better solution (need to add memory function) ?
                except Exception as e:
                    LOGGER.info("[-] Python Translator Exception")
                    LOGGER.info(py_expr)
                    raise e
            if isinstance(tmp_val, int) or isinstance(tmp_val, long):
                # Set one byte per address
                for i in range(0, bytes_needed(tmp_val)):
                    tmp[dst + i] = {'to': src['to'], 'd_ref_type': 2, 'value': extract_byte_x(tmp_val, i + 1), 'spd': src['spd']}
        return tmp

    bs = bin_stream_ida()
    machine = Machine("x86_32")
    mdis = machine.dis_engine(bs, dont_dis_nulstart_bloc=True)

    ir_arch = machine.ira(mdis.loc_db)

    # Disassemble the targeted function until end
    mdis.dont_dis = [end]
    asmcfg = mdis.dis_multiblock(start)

    # Generate IR
    ircfg = ir_arch.new_ircfg_from_asmcfg(asmcfg)

    # Replace ExprID
    regs = machine.mn.regs.regs_init
    # regs[ExprId('ESP', 32)] = ExprId('ESP', 32)
    # regs[ExprId('ESP', 32)] = ExprId('EBP', 32)
    # regs[ExprId('EAX', 32)] = ExprId('EAX', 32)
    # regs[ExprId('EBX', 32)] = ExprId('EBX', 32)
    # regs[ExprId('ECX', 32)] = ExprId('ECX', 32)
    # regs[ExprId('EDX', 32)] = ExprId('EDX', 32)
    # regs[ExprId('XMM0_init', 32)] = ExprId('XMM0', 32)

    LOGGER.info("[+] Get symbolic engine")
    symb = SymbolicExecutionEngine(ir_arch, regs)
    LOGGER.info("[+] Run symbolic execution")
    data = run_symb_exec(symb, start, end)
    LOGGER.info("[+] Concretize stack pointer")
    data = concretizes_stack_ptr(data)
    LOGGER.info("[+] Resolves source expression")
    data = eval_src_expr(symb, data)
    return data


def set_data_to_ida(start, data):
    """ Add comment with the strings, and redefine stack variables in IDA
    """
    size_local_var = idc.get_func_attr(start, idc.FUNCATTR_FRSIZE)
    frame_high_offset = 0x100000000
    frame_id = idc.get_func_attr(start, idc.FUNCATTR_FRAME)
    l_xref = dict()

    if size_local_var == 0:  # TODO : handle when stack size is undefined
        size_local_var = 0x1000  # set stack size to arbitrary value 0x1000

    # represent the stack as a list (initialization)
    stack = [0 for i in range(0, size_local_var)]

    # fill the stack
    for offset, value in data.items():
        if (offset >> 31) == 1:  # EBP based frame
            if offset <= frame_high_offset:
                stack_offset = twos_complement(offset)  # Ex: 0xfffffe00 -> 0x200
                frame_offset = stack_to_frame_offset(stack_offset, size_local_var)  # frame_offset = member_offset = positive offset in the stack frame
                l_xref[frame_offset] = value
                stack[frame_offset] = value['value']
        else:  # ESP based frame
            l_xref[offset] = value
            stack_offset = frame_to_stack_offset(offset, size_local_var)
            frame_offset = offset
            stack[offset] = value['value']
        LOGGER.debug("offset 0x%x, stack_offset 0x%x, frame_offset 0x%x, value %s" % (offset, stack_offset, frame_offset, chr(value['value'])))

    index = 0
    while index < len(stack):
        if stack[index] != 0:  # skip null bytes
            buff = get_bytes_until_delimiter(stack[index::], [0x0, 0x0])  # TODO: adjust auto ascii/utf16 ?

            if buff:
                LOGGER.debug("raw output = %s" % repr(buff))
                ascii_buff = zeroes_out(buff)  # transform to ascii, deletes null bytes

                if buff[1] == 0x0:  # check if string is utf16 TODO: check if really useful
                    STRTYPE = idc.STRTYPE_C_16
                else:
                    STRTYPE = idc.STRTYPE_C

                var_offset = index
                var_name = idc.get_member_name(frame_id, var_offset)
                s_string = "".join(map(chr, ascii_buff))
                LOGGER.info("[+] Index 0x%x, var_offset 0x%x, frame_id = 0x%x, var_name %s, strings = %s, clean_string = %s, len_strings = 0x%x, size_local_var 0x%x, xref to 0x%x" % (index, var_offset, frame_id, var_name, s_string, replace_forbidden_char(s_string), len(buff), size_local_var, l_xref[index]['to']))

                # delete struct member, to create array of char
                LOGGER.info("[+] Delete structure member")
                for a_var_offset in range(var_offset, var_offset + len(buff)):
                    # if idc.get_member_name(frame_id, a_var_offset):
                    ret = idc.del_struc_member(frame_id, a_var_offset)
                    LOGGER.debug("ret del_struc_member = %d" % ret)

                # re add structure member with good size and proper name
                LOGGER.info("[+] Add structure member")
                ret = idc.add_struc_member(frame_id, replace_forbidden_char(s_string)[:0x10], var_offset, idc.FF_STRLIT, STRTYPE, len(buff))
                LOGGER.debug("ret add_struc_member = %d" % ret)

                # Comment instruction using the xref key
                LOGGER.info("[+] Comment instruction")
                if l_xref[index]['d_ref_type'] == 2:
                    if not idc.get_cmt(l_xref[index]['to'], 0):
                        idc.set_cmt(l_xref[index]['to'], s_string, 0)
                index = index + len(buff)
            else:
                index = index + 1
        else:
            index = index + 1
    LOGGER.info("[+] end")
    return stack


def main():
    start, end = idc.read_selection_start(), idc.read_selection_end()

    if start != 0xFFFFFFFF and end != 0xFFFFFFFF:
        data = symbolic_exec(start, end)
        LOGGER.info("[+] Set and comment data to IDA")
        stack = set_data_to_ida(start, data)
    else:
        print("Error: Select instructions")
        stack = -0x1
    return stack


if __name__ == '__main__':
    idaapi.CompileLine('static key_F3() { RunPythonStatement("main()"); }')
    idc.add_idc_hotkey("F3", "key_F3")

    LOGGER = logging.getLogger(__name__)
    if not LOGGER.handlers:  # Avoid duplicate handler
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(logging.Formatter("[%(levelname)s]  %(message)s"))
        LOGGER.addHandler(console_handler)
        LOGGER.setLevel(logging.DEBUG)

    print("=" * 50)
    print("""Available commands:
    main() - F3: Symbolic execution of current selection
    """)