CROSS-REFERENCES AND GRAPHING

An ordinary flow is the simplest flow type, and it represents sequential flow from one instruction to another. This is the default execution flow for all nonbranching instructions such as ADD. There are no special display indicators for ordinary flows other than the order in which instructions are listed in the disassembly. If instruction A has an ordinary flow to instruction B, then instruction B will immediately follow instruction A in the disassembly listing. In the following listing, every instruction other than the two marked instructions has an associated ordinary flow to its immediate successor:
 
Instructions used to invoke functions, such as the marked x86 call instructions, are assigned a call flow, indicating transfer of control to the target function. In most cases, an ordinary flow is also assigned to call instructions, as most functions return to the location that follows the call. If IDA believes that a function does not return (as determined during the analysis phase), then calls to that function will not have an ordinary flow assigned. Call flows are noted by the display of cross-references at the target function (the destination address of the flow). The resulting disassembly of the callflow function is shown here:

In this example, two cross-references are displayed at the address of callflow to indicate that the function is called twice. The address displayed in the cross-references is displayed as an offset into the calling function unless the calling address has an associated name, in which case the name is used. Both forms of addresses are used in the cross-references shown here. Cross-references resulting from function calls are distinguished through use of the p suffix (think P for Procedure).

A jump flow is assigned to each unconditional and conditional branch instruction. Conditional branches are also assigned ordinary flows to account for control flow when the branch is not taken. Unconditional branches have no associated ordinary flow because the branch is always taken in such cases. The dashed line break in the listing is a display device used to indicate that an ordinary flow does not exist between two adjacent instructions. Jump flows are associated with jump-style cross-references displayed at the target of the jump. As with call-style cross-references, jump cross-references display the address of the referring location (the source of the jump). Jump cross-references are distinguished by the use of a j suffix (think J for Jump).
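
The three code flow types can be summarized as a small decision rule. The following is only a minimal sketch of that rule, not IDA's implementation: the mnemonic sets and the function name flows_for are hypothetical, and the treatment of return instructions (no outgoing flow) is an assumption added for completeness.

```python
# Hypothetical mnemonic sets for illustration; a real disassembler takes
# this information from its processor module rather than from name lists.
UNCONDITIONAL_JUMPS = {"jmp"}
CONDITIONAL_JUMPS = {"jz", "jnz", "je", "jne", "jg", "jl", "ja", "jb"}
RETURNS = {"ret", "retn"}  # assumption: returns have no outgoing flow


def flows_for(mnemonic, target_returns=True):
    """Return the set of flow types leaving a single instruction."""
    m = mnemonic.lower()
    if m in RETURNS:
        return set()
    if m in UNCONDITIONAL_JUMPS:
        # The branch is always taken, so there is no ordinary flow.
        return {"jump"}
    if m in CONDITIONAL_JUMPS:
        # The ordinary flow covers the branch-not-taken case.
        return {"jump", "ordinary"}
    if m == "call":
        # Calls to functions believed not to return get no ordinary flow.
        return {"call", "ordinary"} if target_returns else {"call"}
    return {"ordinary"}
```

For instance, flows_for("call", target_returns=False) yields only the call flow, which mirrors the no-return case described above.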

Data Cross-References

Data cross-references are used to track the manner in which data is accessed within a binary. Data cross-references can be associated with any byte in an IDA database that is associated with a virtual address (in other words, data cross-references are never associated with stack variables). The three most commonly encountered types of data cross-references are used to indicate when a location is being read, when a location is being written, and when the address of a location is being taken. The global variables associated with the previous example program are shown here, as they provide several examples of data cross-references.

A read cross-reference is used to indicate that the contents of a memory location are being accessed. Read cross-references can originate only from an instruction address but may refer to any program location. The global variable read_it is read at the marked locations in Listing 9-1. The associated cross-reference comments shown in this listing indicate exactly which locations in main are referencing read_it and are recognizable as read cross-references based on the use of the r suffix. The first read performed on read_it is a 32-bit read into the ECX register, which leads IDA to format read_it as a dword (dd). In general, IDA takes as many cues as it possibly can in order to determine the size and/or type of variables based on how they are accessed and how they are used as parameters to functions.

The global variable write_it is referenced at the marked locations in Listing 9-1. Associated write cross-references are generated and displayed as comments for the write_it variable, indicating the program locations that modify the contents of the variable. Write cross-references utilize the w suffix. Here again, IDA has determined the size of the variable based on the fact that the 32-bit EAX register is copied into write_it. Note that the list of cross-references displayed at write_it terminates with an ellipsis (marked in the listing above), indicating that the number of cross-references to write_it exceeds the current display limit for cross-references. This limit can be modified through the Number of displayed xrefs setting on the Cross-references tab in the Options > General dialog. As with read cross-references, write cross-references can originate only from a program instruction but may reference any program location. Generally speaking, a write cross-reference that targets a program instruction byte is indicative of self-modifying code, which is usually considered bad form and is frequently encountered in the de-obfuscation routines used in malware.

The third type of data cross-reference, an offset cross-reference, indicates that the address of a location is being used (rather than the content of the location). The address of global variable ref_it is taken at the marked location in Listing 9-1, resulting in the offset cross-reference comment at ref_it in the previous listing (suffix o). Offset cross-references are commonly the result of pointer operations either in code or in data. Array access operations, for example, are typically implemented by adding an offset to the starting address of the array. As a result, the first address in most global arrays can often be recognized by the presence of an offset cross-reference. For this reason, most string data (strings being arrays of characters in C/C++) is the target of offset cross-references.
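
The five suffixes introduced so far (p, j, r, w, and o) can be collected into a lookup table. The sketch below is illustrative only: the function name describe_xref is hypothetical, and it assumes a reference is written as a source location, an optional up or down arrow, and a one-letter type suffix.

```python
# Suffix letters as described in the text. The optional arrow before the
# suffix is an assumption about how a reference string is laid out.
XREF_SUFFIXES = {
    "p": ("code", "function call"),
    "j": ("code", "jump"),
    "r": ("data", "read"),
    "w": ("data", "write"),
    "o": ("data", "offset (address taken)"),
}
DIRECTIONS = {"\u2191": "up", "\u2193": "down"}


def describe_xref(reference):
    """Split a reference such as 'main+2A' plus a type suffix into its parts."""
    suffix = reference[-1]
    category, meaning = XREF_SUFFIXES[suffix]  # KeyError on an unknown suffix
    source = reference[:-1]
    direction = None
    if source and source[-1] in DIRECTIONS:
        direction = DIRECTIONS[source[-1]]
        source = source[:-1]
    return {
        "source": source,
        "direction": direction,
        "category": category,
        "meaning": meaning,
    }
```

Keeping the table separate from the parsing makes it easy to extend if other cross-reference types need to be recognized.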

Unlike read and write cross-references, which can originate only from instruction locations, offset cross-references can originate from either instruction locations or data locations. An example of an offset that can originate from a program's data section is any table of pointers (such as a vtable) that results in the generation of an offset cross-reference from each location within the table to the location being pointed to by those locations. You can see this if you examine the vtable for class SubClass from Chapter 8, whose disassembly is shown here:

Here you see that the address of the vtable is used in the function SubClass::SubClass(void), which is the class constructor. The header lines for function SubClass::vfunc3(void), shown here, show the offset cross-reference that links the function to a vtable.

This example demonstrates one of the characteristics of C++ virtual functions that becomes quite obvious when combined with offset cross-references, namely that C++ virtual functions are never called directly and should never be the target of a call cross-reference. Instead, all C++ virtual functions should be referred to by at least one vtable entry and should always be the target of at least one offset cross-reference. Remember that overriding a virtual function is not mandatory. Therefore, a virtual function can appear in more than one vtable, as discussed in Chapter 8. Backtracking offset cross-references is one technique for easily locating C++ vtables in a program's data section.

Cross-reference Lists

With an understanding of what cross-references are, we can now discuss the manner in which you may access all of this data within IDA. As mentioned previously, the number of cross-reference comments that can be displayed at a given location is limited by a configuration setting that defaults to 2. As long as the number of cross-references to a location does not exceed this limit, then working with those cross-references is fairly straightforward. Mousing over the cross-reference text displays the disassembly of the source region in a tool tip-style display, while double-clicking the cross-reference address jumps the disassembly window to the source of the cross-reference.

There are two methods for viewing the complete list of cross-references to a location. The first method is to open a cross-references subview associated with a specific address. By positioning the cursor on an address that is the target of one or more cross-references and selecting View > Open Subviews > Cross-References, you can open the complete list of cross-references to a given location, as shown in Figure 9-3, which shows the complete list of cross-references to variable write_it.

The columns of the window indicate the direction (Up or Down) to the source of the cross-reference, the type of cross-reference (using the type suffixes discussed previously), the source address of the cross-reference, and the corresponding disassembled text at the source address, including any comments that may exist at the source address. As with other windows that display lists of addresses, double-clicking any entry repositions the disassembly display to the corresponding source address. Once opened, the cross-reference display window remains open and accessible via a title tab displayed along with every other open subview's title tab above the disassembly area.

The second way to access a list of cross-references is to highlight a name that you are interested in learning about and choose Jump > Jump to xref (hotkey CTRL-X) to open a dialog that lists every location that references the selected symbol. The resulting dialog, shown in Figure 9-4, is nearly identical in appearance to the cross-reference subview shown in Figure 9-3. In this case, the dialog was activated using the CTRL-X hotkey with the first instance of write_it (.text:0040102B) selected.

The primary difference in the two displays is behavioral. Being a modal dialog, the display in Figure 9-4 has buttons to interact with and terminate the dialog. The primary purpose of this dialog is to select a referencing location and jump to it. Double-clicking one of the listed locations dismisses the dialog and repositions the disassembly window at the selected location. The second difference between the dialog and the cross-reference subview is that the former can be opened using a hotkey or context-sensitive menu from any instance of a symbol, while the latter can be opened only when you position the cursor on an address that is the target of a cross-reference and choose View > Open Subviews > Cross-References. Another way of thinking about it is that the dialog can be opened at the source of any cross-reference, while the subview can be opened only at the destination of the cross-reference.

An example of the usefulness of cross-reference lists might be to rapidly locate every location from which a particular function is called. Many people consider the use of the C strcpy function to be dangerous. Using cross-references, locating every call to strcpy is as simple as finding any one call to strcpy, using the CTRL-X hotkey to bring up the cross-reference dialog, and working your way through every call cross-reference. If you don't want to take the time to find strcpy used somewhere in the binary, you can even get away with adding a comment with the text strcpy in it and activating the cross-reference dialog using the comment.
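
Conceptually, working through every call cross-reference to strcpy is a filter over a list of cross-reference records. The following sketch models that filter outside IDA. The Xref record, the sample data, and the function call_sites are all hypothetical and exist only for illustration.

```python
from collections import namedtuple

# A simplified cross-reference record: where it comes from, what it
# targets, and its one-letter type suffix (p, j, r, w, or o).
Xref = namedtuple("Xref", ["source", "target", "kind"])

# Hypothetical sample data; a real list would come from the disassembler.
XREFS = [
    Xref("main+2A", "strcpy", "p"),
    Xref("parse_args+10", "strcpy", "p"),
    Xref("main+40", "buffer", "w"),
    Xref("handlers+4", "strcpy", "o"),
]


def call_sites(xrefs, name):
    """Return the source locations of every call cross-reference to name."""
    return [x.source for x in xrefs if x.target == name and x.kind == "p"]
```

Note that the offset reference from the hypothetical handlers table is excluded, since only the p suffix marks a call.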

Function Calls

A specialized cross-reference listing dealing exclusively with function calls is available by choosing View > Open Subviews > Function Calls. Figure 9-5 shows the resulting dialog, which lists all locations that call the current function (as defined by the cursor location at the time the view is opened) in the upper half of the window and all calls made by the current function in the lower half of the window.

Here again, each listed cross-reference can be used to quickly reposition the disassembly listing to the corresponding cross-reference location. Restricting ourselves to considering function call cross-references allows us to think about more abstract relationships than simple mappings from one address to another and instead consider how functions relate to one another. In the next section, we show how IDA takes advantage of this by providing several types of graphs, all designed to assist you in interpreting a binary.

IDA Graphing

Because cross-references relate one address to another, they are a natural place to begin if we want to make graphs of our binaries. By restricting ourselves to specific types of cross-references, we can derive a number of useful graphs for analyzing our binaries. For starters, cross-references serve as the edges (the lines that connect points) in our graphs. Depending on the type of graph we wish to generate, individual nodes (the points in the graph) can be individual instructions, groups of instructions called basic blocks, or entire functions. IDA has two distinct graphing capabilities: an external graphing capability utilizing a bundled graphing application and an integrated, interactive graphing capability. Both of these graphing capabilities are covered in the following sections.

IDA External (Third-Party) Graphing

IDA's external graphing capability utilizes third-party graphing applications to display IDA-generated graph files. For Windows versions prior to 6.1, IDA ships with a bundled graphing application named wingraph32. For IDA 6.0, non-Windows versions of IDA are configured to use the dotty graph viewer by default. Beginning with IDA 6.1, all versions of IDA ship with and are configured to use the qwingraph graph viewer, which is a cross-platform Qt port of wingraph32. While the dotty configuration options remain visible for Linux users, they are commented out by default. The graph viewer used by IDA may be configured by editing the GRAPH_VISUALIZER variable in <IDADIR>/cfg/ida.cfg.

Whenever an external-style graph is requested, the source for the graph is generated and saved to a temporary file; then the designated third-party graph viewer is launched to display the graph. IDA supports two graph specification languages, Graph Description Language (GDL) and the DOT language utilized by the graphviz project. The graph specification language used by IDA may be configured by editing the GRAPH_FORMAT variable in <IDADIR>/cfg/ida.cfg. Legal values for this variable are DOT and GDL. You must ensure that the language you specify here is compatible with the viewer you have specified in GRAPH_VISUALIZER.
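
To give a feel for what a graph specification file contains, the sketch below writes a list of edges as a DOT digraph. It is not a reproduction of the files IDA generates, whose exact layout and attributes are not shown here; the function name to_dot and the node names are hypothetical.

```python
def to_dot(edges, name="callgraph"):
    """Render (source, destination) pairs as DOT digraph source text."""
    lines = ["digraph %s {" % name]
    for src, dst in edges:
        # Quoting node names keeps identifiers containing punctuation valid.
        lines.append('    "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical edges, printed so the text can be saved to a .dot file.
    print(to_dot([("main", "helper"), ("helper", "puts")]))
```

Text in this form can be opened by a DOT-compatible viewer, which is why the configured format and the configured viewer must agree.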

Five types of graphs may be generated from the View > Graphs submenu. Available external mode graphs include the following:

• Function flowchart
• Call graph for the entire binary
• Graph of cross-references to a symbol
• Graph of cross-references from a symbol
• Customized cross-reference graph

For two of these, the flowchart and the call graph, IDA is capable of generating and saving GDL (not DOT) files for use independently of IDA. These options may be found on the File > Produce file submenu. Saving the specification file for other types of graphs may be possible if your configured graph viewer allows you to save the currently displayed graph. A number of limitations exist when dealing with any external graph. First and foremost is the fact that external graphs are not interactive. Manipulation of displayed external graphs is limited by the capabilities of your chosen external graph viewer (often only zooming and panning).

BASIC BLOCKS

In a computer program, a basic block is a grouping of one or more instructions with a single entry to the beginning of the block and a single exit from the end of the block. In general, other than the last instruction, every instruction within a basic block transfers control to exactly one successor instruction within the block. Similarly, other than the first instruction, every instruction in a basic block receives control from exactly one predecessor instruction within the block. For the purposes of basic block determination, the fact that function call instructions transfer control outside the current function is generally ignored unless it is known that the function being called fails to return normally. An important behavioral characteristic of basic blocks is that once the first instruction in a basic block is executed, the remainder of the block is guaranteed to execute to completion. This can factor significantly into runtime instrumentation of a program, since it is no longer necessary to set a breakpoint on every instruction in a program or even single step the program in order to record which instructions have executed. Instead, breakpoints can be set on the first instruction of each basic block, and as each breakpoint is hit, every instruction in its associated block can be marked as executed. The Process Stalker component of Pedram Amini's PaiMei* framework performs in exactly this manner.

*Please see http://pedram.redhive.com/code/paimei/.
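
The definition of a basic block translates into a short procedure: find the instructions that begin a block (often called leaders), then cut the instruction list at each one. The sketch below is a simplified model under stated assumptions: instructions are (address, mnemonic, jump target) tuples, any mnemonic starting with "j" is treated as an x86 jump, and calls never end a block. The function name basic_blocks and the sample addresses are hypothetical.

```python
RETURNS = {"ret", "retn"}


def is_jump(mnemonic):
    # Simplification: on x86, jump mnemonics (jmp, jz, jg, ...) start with "j".
    return mnemonic.startswith("j")


def basic_blocks(instructions):
    """Group (address, mnemonic, jump_target) tuples into basic blocks.

    Instructions must be listed in address order. Returns a list of
    blocks, each a list of instruction addresses.
    """
    if not instructions:
        return []
    leaders = {instructions[0][0]}  # the entry point starts a block
    for i, (addr, mnem, target) in enumerate(instructions):
        if is_jump(mnem) or mnem in RETURNS:
            if target is not None:
                leaders.add(target)  # a jump target starts a block
            if i + 1 < len(instructions):
                # The instruction after a jump or return starts a block.
                leaders.add(instructions[i + 1][0])
    blocks = []
    for addr, _, _ in instructions:
        if addr in leaders:
            blocks.append([])
        blocks[-1].append(addr)
    return blocks


# Hypothetical function body used only to exercise the sketch.
CODE = [
    (0x00, "cmp", None),
    (0x03, "jg", 0x0A),
    (0x05, "mov", None),
    (0x08, "call", None),
    (0x0A, "add", None),
    (0x0C, "ret", None),
]
```

A coverage tool in the style described above would place one breakpoint on the first address of each returned block rather than on every instruction.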

External Flowcharts

With the cursor positioned within a function, View > Graphs > Flow Chart (hotkey F12) generates and displays an external flowchart. The flowchart display is the external graph that most closely resembles IDA's integrated graph-based disassembly view. These are not the flowcharts you may have been taught during an introductory programming class. Instead, these graphs might better be named "control flow graphs," as they group a function's instructions into basic blocks and use edges to indicate flow from one block to another.

Figure 9-6 shows a portion of the flowchart of a relatively simple function. As you can see, external flowcharts offer very little in the way of address information, which can make it difficult to correlate the flowchart view to its corresponding disassembly listing.

Flowchart graphs are derived by following the ordinary and jump flows for each instruction in a function, beginning with the entry point to the function.

External Call Graphs

A function call graph is useful for gaining a quick understanding of the hierarchy of function calls made within a program. Call graphs are generated by creating a graph node for each function and then connecting function nodes based on the existence of a call cross-reference from one function to another. The process of generating a call graph for a single function can be viewed as a recursive descent through all of the functions that are called from the initial function. In many cases, it is sufficient to stop descending the call tree once a library function is reached, as it is easier to learn how the library function operates by reading documentation associated with the library rather than by attempting to reverse engineer the compiled version of the function. In fact, in the case of a dynamically linked binary it is not possible to descend into library functions, since the code for such functions is not present within the dynamically linked binary. Statically linked binaries present a different challenge when generating graphs. Since statically linked binaries contain all of the code for the libraries that have been linked to the program, related function call graphs can become extremely large.
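
The recursive descent just described can be sketched in a few lines. This is a model of the idea rather than IDA's own algorithm: the call table, the set of library functions, and the function name call_graph are hypothetical.

```python
def call_graph(calls, root, library=()):
    """Collect call edges reachable from root by recursive descent.

    calls maps a function name to the names it calls. Descent stops at
    any function listed in library, so its callees are not explored.
    """
    edges = []
    seen = set()

    def descend(func):
        if func in seen:
            return  # already expanded; also guards against recursion cycles
        seen.add(func)
        if func in library:
            return
        for callee in calls.get(func, ()):
            edges.append((func, callee))
            descend(callee)

    descend(root)
    return edges


# Hypothetical program used only to exercise the sketch.
CALLS = {
    "main": ["init", "run"],
    "run": ["parse", "printf"],
    "parse": ["strlen"],
    "printf": ["vfprintf"],
}
LIBRARY = {"printf", "strlen"}
```

With the library set supplied, the edge from printf to vfprintf never appears, which is the pruning that keeps graphs of statically linked binaries manageable.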

In order to discuss function call graphs, we make use of the following trivial program that does nothing other than create a simple hierarchy of function calls:

   #include <stdio.h>

   void depth_2_1() {
      printf("inside depth_2_1\n");
   }

   void depth_2_2() {
      fprintf(stderr, "inside depth_2_2\n");
   }

   void depth_1() {
      depth_2_1();
      depth_2_2();
      printf("inside depth_1\n");
   }

   int main() {
      depth_1();
   }

After compiling a dynamically linked binary using GNU gcc, we can ask IDA to generate a function call graph using View > Graphs > Function Calls, which should yield a graph similar to that shown in Figure 9-7. In this instance we have truncated the left side of the graph somewhat in order to offer a bit more detail. The call graph associated with the main function can be seen within the circled area in the figure.

Alert readers may notice that the compiler has substituted calls to puts and fwrite for printf and fprintf, respectively, as they are more efficient when printing static strings. Note that IDA utilizes different colors to represent different types of nodes in the graph, though the colors are not configurable in any way.

Given the straightforward nature of the previous program listing, why does the graph appear to be twice as crowded as it should be? The answer is that the compiler, as virtually all compilers do, has inserted wrapper code responsible for library initialization and termination as well as for configuring parameters properly prior to transferring control to the main function.

Attempting to graph a statically linked version of the same program results in the nasty mess shown in Figure 9-8.

The graph in Figure 9-8 demonstrates a behavior of external graphs in general, namely that they are always scaled initially to display the entire graph, which can result in very cluttered displays. For this particular graph, the status bar at the bottom of the WinGraph32 window indicates that there are 946 nodes and 10,125 edges that happen to cross over one another in 100,182 locations. Other than demonstrating the complexity of statically linked binaries, this graph is all but unusable. No amount of zooming and panning will simplify the graph, and beyond that, there is no way to easily locate a specific function such as main other than by reading the label on each node. By the time you have zoomed in enough to be able to read the labels associated with each node, only a few dozen nodes will fit within the display.

Two types of cross-reference graphs can be generated for global symbols (functions or global variables): cross-references to a symbol (View > Graphs > Xrefs To) and cross-references from a symbol (View > Graphs > Xrefs From). To generate an Xrefs To graph, a recursive ascent is performed by backtracking all cross-references to the selected symbol until a symbol to which no other symbols refer is reached. When analyzing a binary, you can use an Xrefs To graph to answer the question, "What sequence of calls must be made to reach this function?" Figure 9-9 shows the use of an Xrefs To graph to display the paths that can be followed to reach the puts function.

Similarly, Xrefs To graphs can assist you in visualizing all of the locations that reference a global variable and the chain of function calls required to reach those locations. Cross-reference graphs are the only graphs capable of incorporating data cross-reference information.
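
The recursive ascent behind an Xrefs To graph can be modeled by inverting a call table and walking backward from the target. The sketch below uses the call relationships of the example program as compiled (with puts and fwrite substituted), but deliberately omits the compiler-inserted startup code, so its output is smaller than the real graph. The function name paths_to is hypothetical.

```python
def paths_to(calls, target):
    """List every call chain that ends at target, found by recursive ascent.

    calls maps a function name to the names it calls. Ascent stops at
    functions that nothing else calls.
    """
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    paths = []

    def ascend(func, suffix):
        path = [func] + suffix
        # Skip callers already on the path so recursion cannot loop forever.
        preds = [c for c in callers.get(func, []) if c not in path]
        if not preds:
            paths.append(path)
            return
        for caller in preds:
            ascend(caller, path)

    ascend(target, [])
    return paths


# Call relationships of the example program, without startup code.
CALLS = {
    "main": ["depth_1"],
    "depth_1": ["depth_2_1", "depth_2_2", "puts"],
    "depth_2_1": ["puts"],
    "depth_2_2": ["fwrite"],
}
```

Each returned list reads from the outermost caller down to the target, which answers the "what sequence of calls" question directly.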

In order to create an Xrefs From graph, a recursive descent is performed by following cross-references from the selected symbol. If the symbol is a function name, only call references from the function are followed, so data references to global variables do not show up in the graph. If the symbol is an initialized global pointer variable (meaning that it actually points to something), then the corresponding data offset cross-reference is followed. When you graph cross-references from a function, the effective behavior is a function call graph rooted at the selected function, as shown in Figure 9-10.

Unfortunately, the same cluttered graph problems exist when graphing functions with a complex call graph.

Custom Cross-Reference Graphs

Custom cross-reference graphs, called user xref charts in IDA, provide the maximum flexibility in generating cross-reference graphs to suit your needs. In addition to combining cross-references to a symbol and cross-references from a symbol into a single graph, custom cross-reference graphs allow you to specify a maximum recursion depth and the types of symbols that should be included or excluded from the resulting graph.

View > Graphs > User Xrefs Chart opens the graph customization dialog shown in Figure 9-11. Each global symbol that occurs within the specified address range appears as a node within the resulting graph, which is constructed according to the options specified in the dialog. In the most common case, generating cross-references from a single symbol, the start and end addresses are identical. If the start and end addresses differ, then the resulting graph is generated for all nonlocal symbols that occur within the specified range. In the extreme case where the start address is the lowest address in the database and the end address is the highest address in the database, the resulting graph degenerates to the function call graph for the entire binary.

The options that are selected in Figure 9-11 represent the default options for all custom cross-reference graphs. Following is a description of the purpose of each set of options:

Starting direction

Options allow you to decide whether to search for cross-references from the selected symbol, to the selected symbol, or both. If all other options are left at their default settings, restricting the starting direction to Cross references to results in an Xrefs To-style graph, while restricting direction to Cross references from generates an Xrefs From-style graph.

Parameters

The Recursive option enables recursive descent (Xrefs From) or ascent (Xrefs To) from the selected symbols. Follow only current direction forces any recursion to occur in only one direction. In other words, if this option is selected, and node B is discovered to be reachable from node A, the recursive descent into B adds additional nodes that can be reached only from node B. Newly discovered nodes that refer to node B will not be added to the graph. If you choose to deselect Follow only current direction, then when both starting directions are selected, each new node added to the graph is recursed in both the to and from directions.

Recursion depth

This option sets the maximum recursion depth and is useful for limiting the size of generated graphs. A setting of -1 causes recursion to proceed as deep as possible and generates the largest possible graphs.

Ignore

These options dictate what types of nodes will be excluded from the generated graph. This is another means of restricting the size of the resulting graph. In particular, ignoring cross-references from library functions can lead to drastic simplifications of graphs in statically linked binaries. The trick is to make sure that IDA recognizes as many library functions as possible. Library code recognition is the subject of Chapter 12.

Print options

These options control two aspects of graph formatting. Print comments causes any function comments to be included in a function's graph node. If Print recursion dots is selected and recursion would continue beyond the specified recursion limit, a node containing an ellipsis is displayed to indicate that further recursion is possible.

Figure 9-12 shows a custom cross-reference graph generated for function depth_1 in our example program using default options and a recursion depth of 1.
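
The interaction between Recursion depth and Print recursion dots can be illustrated with a depth-limited traversal. The sketch below covers only the Xrefs From direction and makes an assumption about how depth is counted (the start symbol is level 0, and symbols at the limit are not expanded), which may not match IDA exactly. The function name xrefs_from_chart is hypothetical, and the call table reflects the example program without startup code.

```python
def xrefs_from_chart(calls, start, depth=-1, recursion_dots=True):
    """Collect call edges from start, level by level, up to depth levels.

    A depth of -1 means no limit. When the limit cuts off a function that
    still has callees, an edge to "..." marks the omitted recursion.
    """
    edges = []
    seen = {start}
    frontier = [start]
    level = 0
    while frontier:
        at_limit = depth != -1 and level >= depth
        next_frontier = []
        for func in frontier:
            callees = calls.get(func, ())
            if at_limit:
                if callees and recursion_dots:
                    edges.append((func, "..."))
                continue
            for callee in callees:
                edges.append((func, callee))
                if callee not in seen:
                    seen.add(callee)
                    next_frontier.append(callee)
        frontier = next_frontier
        level += 1
    return edges


# Call relationships of the example program, without startup code.
CALLS = {
    "main": ["depth_1"],
    "depth_1": ["depth_2_1", "depth_2_2", "puts"],
    "depth_2_1": ["puts"],
    "depth_2_2": ["fwrite"],
}
```

Walking the graph level by level, rather than depth-first, ensures that a function reachable at several depths is judged against the limit at its shallowest depth.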

User-generated cross-reference graphs are the most powerful external mode graphing capability available in IDA. External flowcharts have largely been superseded by IDA's integrated graph-based disassembly view, and the remaining external graph types are simply canned versions of user-generated cross-reference graphs.

IDA's Integrated Graph View

With version 5.0, IDA introduced a long-awaited interactive, graph-based disassembly view that was tightly integrated into IDA. As mentioned previously, the integrated graphing mode provides an alternative interface to the standard text-style disassembly listing. While in graph mode, disassembled functions are displayed as control flow graphs similar to external-style flowchart graphs. Because a function-oriented control flow graph is used, only one function at a time can be displayed while in graph mode, and graph mode cannot be used for instructions that lie outside any function. For cases in which you wish to view several functions at once, or when you need to view instructions that are not part of a function, you must revert to the text-oriented disassembly listing.

We detailed basic manipulation of the graph view in Chapter 5, but we reiterate a few points here. Switching between text view and graph view is accomplished by pressing the spacebar or right-clicking anywhere in the disassembly window and selecting either Text View or Graph View as appropriate. The easiest way to pan around the graph is to click the background of the graph view and drag the graph in the appropriate direction. For large graphs, you may find it easier to pan using the Graph Overview window instead. The Graph Overview window always displays a dashed rectangle around the portion of the graph currently being displayed in the disassembly window. At any time, you can click and drag the dashed rectangle to reposition the graph display. Because the Graph Overview window displays a miniature version of the entire graph, using it for panning eliminates the need to constantly release the mouse button and reposition the mouse as required when panning across large graphs in the disassembly window.

There are no significant differences between manipulating a disassembly in graph mode and manipulating a disassembly in text mode. Double-click navigation continues to work as you would expect it to, as does the navigation history list. Any time you navigate to a location that does not lie within a function (such as a global variable), the display will automatically switch to text mode. Graph mode will automatically be restored once you navigate back to a function. Access to stack variables is identical to that of text mode, with the summary stack view being displayed in the root basic block of the displayed function. Detailed stack frame views are accessed by double-clicking any stack variable, just as in text mode. All options for formatting instruction operands in text mode remain available and are accessed in the same manner in graph mode.

The primary user interface change related to graph mode deals with the handling of individual graph nodes. Figure 9-13 shows a simple graph node and its related title bar button controls.

From left to right, the three buttons on the node's title bar allow you to change the background color of the node, assign or change the name of the node, and access the list of cross-references to the node. Coloring nodes is a useful way to remind yourself that you have already analyzed a node or to simply make it stand out from others, perhaps because it contains code of particular interest. Once you assign a node a color, the color is also used as the background color for the corresponding instructions in text mode. To easily remove any coloring, right-click the node's title bar and select Set node color to default.

The middle button on the title bar in Figure 9-13 is used to assign a name to the address of the first instruction of the node's basic block. Since basic blocks are often the target of jump instructions, many nodes may already have a dummy name assigned as the result of being targeted by a jump cross-reference. However, it is possible for a basic block to begin without having a name assigned. Consider the following lines of code:

.text:00401041 ➊ jg short loc_401053

.text:00401043 ➋ mov ecx, [ebp+arg_0]

The instruction at ➊ has two potential successors, loc_401053 and the instruction at ➋. Because it has two successors, ➊ must terminate a basic block, which results in ➋ becoming the first instruction in a new basic block, even though it is not targeted explicitly by a jump and thus has no dummy name assigned.

The rightmost button in Figure 9-13 is used to access the list of cross-references that target the node. Since cross-reference comments are not displayed by default in graph mode, this is the easiest way to access and navigate to any location that references the node. Unlike the cross-reference lists we have discussed previously, the generated node cross-reference list also contains an entry for the ordinary flow into the node (designated by type ^). This is required because it is not always obvious in graph view which node is the linear predecessor of a given node. If you wish to view normal cross-reference comments in graph mode, access the Cross-References tab under Options > General and set the Number of displayed xrefs option to something other than zero.

Nodes within a graph may be grouped either by themselves or with other nodes in order to reduce some of the clutter in a graph. To group multiple nodes, CTRL-click the title bar of each node to be grouped and then right-click the title bar of any selected node and select Group nodes. You will be prompted to enter some text (defaults to the first instruction in the group) to be displayed in the collapsed node. Figure 9-14 shows the result of grouping the node in Figure 9-13 and changing the node text to collapsed node demo.
 +
 +
Note that two addïtional buttons are now present in the title bar. In leftto-right order, these buttons allow you to uncollapse (expand) the grouped node and edit the node text. Uncollapsing a node merely expands the nodes within a group to their original form; it dues not change the fact that the node or nodes now belong to a group. When a group is uncollapsed, the two new buttonsjust mentioned are removed and replaced with a single Collapse Group button. An expanded group can easily be collapsed again using the Collapse Group button or by right-clicking the title bar of any node in the group and selecting Hide Group. To completely remove a grouping applied to one or more nodes, you must rïght-click the title bar of the collapsed node or one of the participating uncollapsed nodes and select Ungroup Nodes. This action bas the side effect of expanding the group if it was collapsed at the time.

Current version as of 16 August 2019, 02:13

Some of the more common questions asked while reverse engineering a binary are along the lines of "Where is this function called from?" and "What functions access this data?" These and other similar questions seek to catalog the references to and from various resources in a program. Two examples serve to show the usefulness of such questions.

Consider the case in which you have located a function containing a stack-allocated buffer that can be overflowed, possibly leading to exploitation of the program. Since the function may be buried deep within a complex application, your next step might be to determine exactly how the function can be reached. The function is useless to you unless you can get it to execute. This leads to the question "What functions call this vulnerable function?" as well as additional questions regarding the nature of the data that those functions may pass to the vulnerable function. This line of reasoning must continue as you work your way back up potential call chains to find one that you can influence to properly exploit the overflow that you have discovered.

In another case, consider a binary that contains a large number of ASCII strings, at least one of which you find suspicious, such as "Executing Denial of Service attack!" Does the presence of this string indicate that the binary actually performs a Denial of Service attack? No, it simply indicates that the binary happens to contain that particular ASCII sequence. You might infer that the message is displayed somehow just prior to launching an attack; however, you need to find the related code in order to verify your suspicions. Here the answer to the question "Where is this string referenced?" would help you to quickly track down the program location(s) that make use of the string. From there, perhaps it can assist you in locating any actual Denial of Service attack code.

IDA helps to answer these types of questions through its extensive cross-referencing features. IDA provides a number of mechanisms for displaying and accessing cross-reference data, including graph-generation capabilities that provide a highly visual representation of the relationships between code and data. In this chapter we discuss the types of cross-reference information that IDA makes available, the tools for accessing cross-reference data, and how to interpret that data.

Cross-References

We begin our discussion by noting that cross-references within IDA are often referred to simply as xrefs. Within this text, we will use xref only where it is used to refer to the content of an IDA menu item or dialog. In all other cases we will stick to the term cross-reference. There are two basic categories of cross-references in IDA: code cross-references and data cross-references. Within each category, we will detail several different types of cross-references. Associated with each cross-reference is the notion of a direction. All cross-references are made from one address to another address. The from and to addresses may be either code or data addresses. If you are familiar with graph theory, you may choose to think of addresses as nodes in a directed graph and cross-references as the edges in that graph. Figure 9-1 provides a quick refresher on graph terminology. In this simple graph, three nodes are connected by two directed edges.

Note that nodes may also be referred to as vertices. Directed edges are drawn using arrows to indicate the allowed direction of travel across the edge. In Figure 9-1, it is possible to travel from the upper node to either of the lower nodes, but it is not possible to travel from either of the lower nodes to the upper node.
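The graph vocabulary above can be sketched in a few lines of Python. This is only an illustrative model, not anything IDA exposes: the node names `upper`, `lower_left`, and `lower_right` and the helper `can_travel` are hypothetical stand-ins for the three nodes and two directed edges of Figure 9-1.

```python
# Hypothetical model of Figure 9-1: three nodes, two directed edges.
# Each key is a source node; each value lists the nodes it points to.
edges = {
    "upper": ["lower_left", "lower_right"],
    "lower_left": [],
    "lower_right": [],
}

def can_travel(src, dst):
    """Return True if a directed edge leads from src to dst."""
    return dst in edges.get(src, [])

print(can_travel("upper", "lower_left"))   # the edge points downward
print(can_travel("lower_left", "upper"))   # no edge points back up
```

Storing only outgoing edges per node mirrors the idea that every cross-reference has a from address and a to address.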

Code cross-references are a very important concept, as they facilitate IDA's generation of control flow graphs and function call graphs, each of which we discuss later in the chapter. Before we dive into the details of cross-references, it is useful to understand how IDA displays cross-reference information in a disassembly listing. Figure 9-2 shows the header line for a disassembled function (sub_401000) containing a cross-reference as a regular comment (right side of the figure).

int read_it;          //integer variable read in main
int write_it;         //integer variable written in main
int ref_it;           //integer variable whose address is taken in main
void callflow() {}    //function called twice from main
int main() {
   int *p = &ref_it;     //results in an "offset" style data reference
   *p = read_it;         //results in a "read" style data reference
   write_it = *p;        //results in a "write" style data reference
   callflow();           //results in a "call" style code reference
   if (read_it == 3) {   //results in a "jump" style code reference
      write_it = 2;      //results in a "write" style data reference
   }
   else {                //results in a "jump" style code reference
      write_it = 1;      //results in a "write" style data reference
   }
   callflow();           //results in a "call" style code reference
}

The program contains operations that will exercise all of IDA's cross-referencing features, as noted in the comment text.

An ordinary flow is the simplest flow type, and it represents sequential flow from one instruction to another. This is the default execution flow for all nonbranching instructions such as ADD. There are no special display indicators for ordinary flows other than the order in which instructions are listed in the disassembly. If instruction A has an ordinary flow to instruction B, then instruction B will immediately follow instruction A in the disassembly listing. In the following listing, every instruction other than the two marked ones has an associated ordinary flow to its immediate successor:

Instructions used to invoke functions, such as the x86 call instructions marked in the listing, are assigned a call flow, indicating transfer of control to the target function. In most cases, an ordinary flow is also assigned to call instructions, as most functions return to the location that follows the call. If IDA believes that a function does not return (as determined during the analysis phase), then calls to that function will not have an ordinary flow assigned. Call flows are noted by the display of cross-references at the target function (the destination address of the flow). The resulting disassembly of the callflow function is shown here:

In this example, two cross-references are displayed at the address of callflow to indicate that the function is called twice. The address displayed in the cross-references is displayed as an offset into the calling function unless the calling address has an associated name, in which case the name is used. Both forms of addresses are used in the cross-references shown here. Cross-references resulting from function calls are distinguished through use of the p suffix (think P for Procedure).

A jump flow is assigned to each unconditional and conditional branch instruction. Conditional branches are also assigned ordinary flows to account for control flow when the branch is not taken. Unconditional branches have no associated ordinary flow because the branch is always taken in such cases. The dashed line break in the listing is a display device used to indicate that an ordinary flow does not exist between two adjacent instructions. Jump flows are associated with jump-style cross-references displayed at the target of the jump. As with call-style cross-references, jump cross-references display the address of the referring location (the source of the jump). Jump cross-references are distinguished by the use of a j suffix (think J for Jump).
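The rules for which flows leave an instruction can be summarized in a small Python sketch. The instruction kinds `plain`, `call`, `jcc`, and `jmp` and the function `flows_for` are labels invented for this illustration; they are not IDA identifiers, and the sketch leaves out instructions such as returns.

```python
def flows_for(kind, returns=True):
    """Return the set of flow types leaving one instruction.

    kind is "plain", "call", "jcc" (conditional jump), or "jmp"
    (unconditional jump); returns applies only to calls.
    """
    if kind == "plain":
        return {"ordinary"}
    if kind == "call":
        # A call to a function believed not to return gets no ordinary flow.
        return {"call", "ordinary"} if returns else {"call"}
    if kind == "jcc":
        # The jump itself, plus fall-through when the branch is not taken.
        return {"jump", "ordinary"}
    if kind == "jmp":
        # Always taken, so there is no fall-through.
        return {"jump"}
    raise ValueError(f"unknown instruction kind: {kind}")

for kind in ("plain", "call", "jcc", "jmp"):
    print(kind, sorted(flows_for(kind)))
```

Passing `returns=False` models the case in which the analysis phase has decided that the called function never returns.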

Data Cross-References

Data cross-references are used to track the manner in which data is accessed within a binary. Data cross-references can be associated with any byte in an IDA database that is associated with a virtual address (in other words, data cross-references are never associated with stack variables). The three most commonly encountered types of data cross-references are used to indicate when a location is being read, when a location is being written, and when the address of a location is being taken. The global variables associated with the previous example program are shown here, as they provide several examples of data cross-references.

A read cross-reference is used to indicate that the contents of a memory location are being accessed. Read cross-references can originate only from an instruction address but may refer to any program location. The global variable read_it is read at the marked locations in Listing 9-1. The associated cross-reference comments shown in this listing indicate exactly which locations in main are referencing read_it and are recognizable as read cross-references based on the use of the r suffix. The first read performed on read_it is a 32-bit read into the ECX register, which leads IDA to format read_it as a dword (dd). In general, IDA takes as many cues as it possibly can in order to determine the size and/or type of variables based on how they are accessed and how they are used as parameters to functions.

The global variable write_it is referenced at the marked locations in Listing 9-1. Associated write cross-references are generated and displayed as comments for the write_it variable, indicating the program locations that modify the contents of the variable. Write cross-references utilize the w suffix. Here again, IDA has determined the size of the variable based on the fact that the 32-bit EAX register is copied into write_it. Note that the list of cross-references displayed at write_it terminates with an ellipsis (marked above), indicating that the number of cross-references to write_it exceeds the current display limit for cross-references. This limit can be modified through the Number of displayed xrefs setting on the Cross-references tab in the Options > General dialog. As with read cross-references, write cross-references can originate only from a program instruction but may reference any program location. Generally speaking, a write cross-reference that targets a program instruction byte is indicative of self-modifying code, which is usually considered bad form and is frequently encountered in the de-obfuscation routines used in malware.

The third type of data cross-reference, an offset cross-reference, indicates that the address of a location is being used (rather than the content of the location). The address of global variable ref_it is taken at the marked location in Listing 9-1, resulting in the offset cross-reference comment at ref_it in the previous listing (suffix o). Offset cross-references are commonly the result of pointer operations either in code or in data. Array access operations, for example, are typically implemented by adding an offset to the starting address of the array. As a result, the first address in most global arrays can often be recognized by the presence of an offset cross-reference. For this reason, most string data (strings being arrays of characters in C/C++) is the target of offset cross-references. Unlike read and write cross-references, which can originate only from instruction locations, offset cross-references can originate from either instruction locations or data locations. An example of an offset that can originate from a program's data section is any table of pointers (such as a vtable) that results in the generation of an offset cross-reference from each location within the table to the location being pointed to by those locations. You can see this if you examine the vtable for class SubClass from Chapter 8, whose disassembly is shown here:

Here you see that the address of the vtable is used in the function SubClass::SubClass(void), which is the class constructor. The header lines for function SubClass::vfunc3(void), shown here, show the offset cross-reference that links the function to a vtable.

This example demonstrates one of the characteristics of C++ virtual functions that becomes quite obvious when combined with offset cross-references, namely that C++ virtual functions are never called directly and should never be the target of a call cross-reference. Instead, all C++ virtual functions should be referred to by at least one vtable entry and should always be the target of at least one offset cross-reference. Remember that overriding a virtual function is not mandatory. Therefore, a virtual function can appear in more than one vtable, as discussed in Chapter 8. Backtracking offset cross-references is one technique for easily locating C++ vtables in a program's data section.
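The observation above suggests a simple filter, sketched here in Python over a hand-built list of cross-references. Everything in the sample data (the `vtable_SubClass` entries, `helper`, `table`) and the function `virtual_candidates` is hypothetical, and the result is only a heuristic: ordinary functions whose address is stored in a callback table also pass the same test.

```python
# Each cross-reference is (source, target, type suffix); all names are made up.
xrefs = [
    ("vtable_SubClass+0", "SubClass::vfunc1", "o"),
    ("vtable_SubClass+8", "SubClass::vfunc3", "o"),
    ("main+20", "helper", "p"),
    ("table+4", "helper", "o"),
]

def virtual_candidates(xrefs):
    """Targets with at least one offset xref and no call xref."""
    offset_targets = {dst for _, dst, kind in xrefs if kind == "o"}
    call_targets = {dst for _, dst, kind in xrefs if kind == "p"}
    return sorted(offset_targets - call_targets)

print(virtual_candidates(xrefs))
```

Here `helper` is excluded because it is also the target of a call (p) cross-reference.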

Cross-Reference Lists

With an understanding of what cross-references are, we can now discuss the manner in which you may access all of this data within IDA. As mentioned previously, the number of cross-reference comments that can be displayed at a given location is limited by a configuration setting that defaults to 2. As long as the number of cross-references to a location does not exceed this limit, then working with those cross-references is fairly straightforward. Mousing over the cross-reference text displays the disassembly of the source region in a tool tip-style display, while double-clicking the cross-reference address jumps the disassembly window to the source of the cross-reference.
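The effect of the display limit can be illustrated with a short Python sketch. The function `xref_comment` and the sample sources are hypothetical, and the output format is deliberately simplified; it is not IDA's exact comment layout.

```python
def xref_comment(sources, limit=2):
    """Join up to `limit` xref sources; append "..." if some were left out."""
    shown = sources[:limit]
    text = " ".join(shown)
    if len(sources) > limit:
        text += " ..."
    return text

print(xref_comment(["main+2A", "main+3C", "main+4E"]))
print(xref_comment(["main+2A"]))
```

The trailing ellipsis is the cue that the complete list has to be viewed by other means.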

There are two methods for viewing the complete list of cross-references to a location. The first method is to open a cross-references subview associated with a specific address. By positioning the cursor on an address that is the target of one or more cross-references and selecting View > Open Subviews > Cross-References, you can open the complete list of cross-references to a given location, as shown in Figure 9-3, which shows the complete list of cross-references to variable write_it.

The columns of the window indicate the direction (Up or Down) to the source of the cross-reference, the type of cross-reference (using the type suffixes discussed previously), the source address of the cross-reference, and the corresponding disassembled text at the source address, including any comments that may exist at the source address. As with other windows that display lists of addresses, double-clicking any entry repositions the disassembly display to the corresponding source address. Once opened, the cross-reference display window remains open and accessible via a title tab displayed along with every other open subview's title tab above the disassembly area.
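The direction column can be thought of as a comparison of two addresses. The following Python sketch assumes that Up means the source lies at a lower address than the target, that is, earlier in the listing; the function name `direction` and the sample addresses are illustrative only.

```python
def direction(source_ea, target_ea):
    """Return "Up" if the source lies at a lower address than the target.

    Equal addresses are lumped in with "Down" in this sketch.
    """
    return "Up" if source_ea < target_ea else "Down"

# A reference from code at a low address to data at a higher address.
print(direction(0x0040102B, 0x0049B5E4))
```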

The second way to access a list of cross-references is to highlight a name that you are interested in learning about and choose Jump > Jump to xref (hotkey CTRL-X) to open a dialog that lists every location that references the selected symbol. The resulting dialog, shown in Figure 9-4, is nearly identical in appearance to the cross-reference subview shown in Figure 9-3. In this case, the dialog was activated using the CTRL-X hotkey with the first instance of write_it (.text:0040102B) selected.

The primary difference in the two displays is behavioral. Being a modal dialog, the display in Figure 9-4 has buttons to interact with and terminate the dialog. The primary purpose of this dialog is to select a referencing location and jump to it. Double-clicking one of the listed locations dismisses the dialog and repositions the disassembly window at the selected location. The second difference between the dialog and the cross-reference subview is that the former can be opened using a hotkey or context-sensitive menu from any instance of a symbol, while the latter can be opened only when you position the cursor on an address that is the target of a cross-reference and choose View > Open Subviews > Cross-References. Another way of thinking about it is that the dialog can be opened at the source of any cross-reference, while the subview can be opened only at the destination of the cross-reference.

An example of the usefulness of cross-reference lists might be to rapidly locate every location from which a particular function is called. Many people consider the use of the C strcpy function to be dangerous. Using cross-references, locating every call to strcpy is as simple as finding any one call to strcpy, using the CTRL-X hotkey to bring up the cross-reference dialog, and working your way through every call cross-reference. If you don't want to take the time to find strcpy used somewhere in the binary, you can even get away with adding a comment with the text strcpy in it and activating the cross-reference dialog using the comment.
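Working through every call cross-reference amounts to filtering a list by target and type. The Python sketch below does this over made-up data; the addresses, the tuple layout, and the function `call_sites` are assumptions for illustration and do not come from IDA.

```python
# (source address, target name, type suffix); sample values are invented.
xrefs = [
    (0x00401010, "strcpy", "p"),
    (0x00401234, "strcpy", "p"),
    (0x00402000, "strcpy", "o"),   # address taken, not a call
    (0x00401300, "memcpy", "p"),
]

def call_sites(xrefs, name):
    """Return sorted source addresses of call ("p") xrefs to `name`."""
    return sorted(src for src, dst, kind in xrefs if dst == name and kind == "p")

for ea in call_sites(xrefs, "strcpy"):
    print(f"{ea:08X}")
```

Filtering on the p suffix keeps the offset reference out of the audit list, since taking the address of strcpy is not itself a call.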

Function Calls

A specialized cross-reference listing dealing exclusively with function calls is available by choosing View > Open Subviews > Function Calls. Figure 9-5 shows the resulting dialog, which lists all locations that call the current function (as defined by the cursor location at the time the view is opened) in the upper half of the window and all calls made by the current function in the lower half of the window.

Here again, each listed cross-reference can be used to quickly reposition the disassembly listing to the corresponding cross-reference location. Restricting ourselves to considering function call cross-references allows us to think about more abstract relationships than simple mappings from one address to another and instead consider how functions relate to one another. In the next section, we show how IDA takes advantage of this by providing several types of graphs, all designed to assist you in interpreting a binary.
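The two halves of the Function Calls window correspond to two queries over the same set of call relationships. A minimal Python sketch, using invented function names and a hypothetical helper `function_calls`, might look like this:

```python
# (caller, callee) pairs derived from call cross-references; names are invented.
calls = [
    ("main", "parse"),
    ("main", "report"),
    ("parse", "read_line"),
    ("report", "parse"),
]

def function_calls(calls, current):
    """Return (callers of current, callees of current), each sorted."""
    callers = sorted({src for src, dst in calls if dst == current})
    callees = sorted({dst for src, dst in calls if src == current})
    return callers, callees

callers, callees = function_calls(calls, "parse")
print("called by:", callers)
print("calls:", callees)
```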

IDA Graphing

Because cross-references relate one address to another, they are a natural place to begin if we want to make graphs of our binaries. By restricting ourselves to specific types of cross-references, we can derive a number of useful graphs for analyzing our binaries. For starters, cross-references serve as the edges (the lines that connect points) in our graphs. Depending on the type of graph we wish to generate, individual nodes (the points in the graph) can be individual instructions, groups of instructions called basic blocks, or entire functions. IDA has two distinct graphing capabilities: an external graphing capability utilizing a bundled graphing application and an integrated, interactive graphing capability. Both of these graphing capabilities are covered in the following sections.

IDA External (Third-Party) Graphing

IDA's external graphing capability utilizes third-party graphing applications to display IDA-generated graph files. For Windows versions prior to 6.1, IDA ships with a bundled graphing application named wingraph32. For IDA 6.0, non-Windows versions of IDA are configured to use the dotty graph viewer by default. Beginning with IDA 6.1, all versions of IDA ship with and are configured to use the qwingraph graph viewer, which is a cross-platform Qt port of wingraph32. While the dotty configuration options remain visible for Linux users, they are commented out by default. The graph viewer used by IDA may be configured by editing the GRAPH_VISUALIZER variable in <IDADIR>/cfg/ida.cfg.

Whenever an external-style graph is requested, the source for the graph is generated and saved to a temporary file; then the designated third-party graph viewer is launched to display the graph. IDA supports two graph specification languages, Graph Description Language (GDL) and the DOT language utilized by the graphviz project. The graph specification language used by IDA may be configured by editing the GRAPH_FORMAT variable in <IDADIR>/cfg/ida.cfg. Legal values for this variable are DOT and GDL. You must ensure that the language you specify here is compatible with the viewer you have specified in GRAPH_VISUALIZER.
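To give a feel for what a graph specification looks like, the following Python sketch writes a minimal DOT digraph from a list of edges. It is not the file that IDA generates; the function `to_dot`, the graph name, and the sample edges are assumptions made for illustration.

```python
def to_dot(edges, name="xrefs"):
    """Render (source, target) pairs as the text of a DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so characters such as '+' or ':' stay legal.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

print(to_dot([("main", "depth_1"), ("depth_1", "puts")]))
```

Text of this form can be saved to a file and opened with any DOT-capable viewer.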

Five types of graphs may be generated from the View > Graphs submenu. Available external mode graphs include the following:

• Function flowchart
• Call graph for the entire binary
• Graph of cross-references to a symbol
• Graph of cross-references from a symbol
• Customized cross-reference graph

For two of these, the flowchart and the call graph, IDA is capable of generating and saving GDL (not DOT) files for use independently of IDA. These options may be found on the File > Produce file submenu. Saving the specification file for other types of graphs may be possible if your configured graph viewer allows you to save the currently displayed graph. A number of limitations exist when dealing with any external graph. First and foremost is the fact that external graphs are not interactive. Manipulation of displayed external graphs is limited by the capabilities of your chosen external graph viewer (often only zooming and panning).

BASIC BLOCKS

In a computer program, a basic block is a grouping of one or more instructions with a single entry to the beginning of the block and a single exit from the end of the block. In general, other than the last instruction, every instruction within a basic block transfers control to exactly one successor instruction within the block. Similarly, other than the first instruction, every instruction in a basic block receives control from exactly one predecessor instruction within the block. For the purposes of basic block determination, the fact that function call instructions transfer control outside the current function is generally ignored unless it is known that the function being called fails to return normally. An important behavioral characteristic of basic blocks is that once the first instruction in a basic block is executed, the remainder of the block is guaranteed to execute to completion (barring an exception). This can factor significantly into runtime instrumentation of a program, since it is no longer necessary to set a breakpoint on every instruction in a program or even single-step the program in order to record which instructions have executed. Instead, breakpoints can be set on the first instruction of each basic block, and as each breakpoint is hit, every instruction in its associated block can be marked as executed. The Process Stalker component of Pedram Amini's PaiMei* framework performs in exactly this manner.

  * Please see http://pedram.redhive.com/code/paimei/.
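The basic block definition and the breakpoint-per-block idea can both be sketched in Python. The instruction list, the kind labels (`plain`, `jcc`, `jmp`, `call`, `ret`), and the functions `basic_blocks` and `mark_executed` are all hypothetical; this is a simplified model that assumes every called function returns, not a description of how IDA or PaiMei is implemented.

```python
# (address, kind, jump target or None); kinds are labels invented for this sketch.
program = [
    (0x401040, "plain", None),
    (0x401041, "jcc", 0x401053),   # conditional jump: ends a block
    (0x401043, "plain", None),     # falls through, so it starts a new block
    (0x401046, "call", None),      # calls do not end a block here
    (0x40104B, "jmp", 0x401058),
    (0x401053, "plain", None),     # jump target: starts a block
    (0x401058, "ret", None),
]

def basic_blocks(program):
    """Split a linear instruction list into basic blocks (lists of addresses)."""
    addrs = [ea for ea, _, _ in program]
    leaders = {addrs[0]}
    for i, (ea, kind, target) in enumerate(program):
        if kind in ("jcc", "jmp", "ret"):
            if target is not None:
                leaders.add(target)          # a jump target begins a block
            if i + 1 < len(program):
                leaders.add(addrs[i + 1])    # so does whatever follows a branch
    blocks = []
    for ea in addrs:
        if ea in leaders:
            blocks.append([])
        blocks[-1].append(ea)
    return blocks

def mark_executed(blocks, hit_leaders):
    """Mark every instruction of each block whose first address was hit."""
    executed = set()
    for block in blocks:
        if block[0] in hit_leaders:
            executed.update(block)
    return executed

blocks = basic_blocks(program)
print(len(blocks), "blocks")
print(sorted(hex(ea) for ea in mark_executed(blocks, {0x401043})))
```

One breakpoint hit at 0x401043 is enough to mark all three instructions of that block as executed.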

External Flowcharts

With the cursor positioned within a function, View > Graphs > Flow Chart (hotkey F12) generates and displays an external flowchart. The flowchart display is the external graph that most closely resembles IDA's integrated graph-based disassembly view. These are not the flowcharts you may have been taught during an introductory programming class. Instead, these graphs might better be named "control flow graphs," as they group a function's instructions into basic blocks and use edges to indicate flow from one block to another.

Figure 9-6 shows a portion of the flowchart of a relatively simple function. As you can see, external flowcharts offer very little in the way of address information, which can make it difficult to correlate the flowchart view to its corresponding disassembly listing. Flowchart graphs are derived by following the ordinary and jump flows for each instruction in a function, beginning with the entry point to the function.
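Following only ordinary and jump flows from the entry point is a standard reachability walk. The Python sketch below uses an invented `flows` table and a hypothetical function `flowchart_addresses`; note that the call flow is deliberately not followed, so the callee's instructions stay out of the result.

```python
from collections import deque

# flows[address] lists (flow type, destination) pairs; values are invented.
flows = {
    0x1000: [("ordinary", 0x1002)],
    0x1002: [("jump", 0x1010), ("ordinary", 0x1004)],
    0x1004: [("call", 0x2000), ("ordinary", 0x1009)],
    0x1009: [("jump", 0x1010)],
    0x1010: [],                       # return: no outgoing flows
    0x2000: [("ordinary", 0x2001)],   # body of the called function
    0x2001: [],
}

def flowchart_addresses(flows, entry):
    """Collect addresses reachable from entry via ordinary and jump flows."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        ea = queue.popleft()
        for kind, dst in flows.get(ea, []):
            if kind in ("ordinary", "jump") and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return sorted(seen)

print([hex(ea) for ea in flowchart_addresses(flows, 0x1000)])
```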

External Call Graphs

A function call graph is useful for gaining a quick understanding of the hierarchy of function calls made within a program. Call graphs are generated by creating a graph node for each function and then connecting function nodes based on the existence of a call cross-reference from one function to another. The process of generating a call graph for a single function can be viewed as a recursive descent through all of the functions that are called from the initial function. In many cases, it is sufficient to stop descending the call tree once a library function is reached, as it is easier to learn how the library function operates by reading documentation associated with the library rather than by attempting to reverse engineer the compiled version of the function. In fact, in the case of a dynamically linked binary it is not possible to descend into library functions, since the code for such functions is not present within the dynamically linked binary. Statically linked binaries present a different challenge when generating graphs. Since statically linked binaries contain all of the code for the libraries that have been linked to the program, related function call graphs can become extremely large.

In order to discuss function call graphs, we make use of the following trivial program that does nothing other than create a simple hierarchy of function calls:

#include <stdio.h>
void depth_2_1() {
   printf("inside depth_2_1\n");
}
void depth_2_2() {
   fprintf(stderr, "inside depth_2_2\n");
}
void depth_1() {
   depth_2_1();
   depth_2_2();
   printf("inside depth_1\n");
}
int main() {
   depth_1();
}

After compiling a dynamically linked binary using GNU gcc, we can ask IDA to generate a function call graph using View > Graphs > Function Calls, which should yield a graph similar to that shown in Figure 9-7. In this instance we have truncated the left side of the graph somewhat in order to offer a bit more detail. The call graph associated with the main function can be seen within the circled area in the figure.

Alert readers may notice that the compiler has substituted calls to puts and fwrite for printf and fprintf, respectively, as they are more efficient when printing static strings. Note that IDA utilizes different colors to represent different types of nodes in the graph, though the colors are not configurable in any way.
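The recursive descent described earlier in this section can be sketched in Python using the call hierarchy of the sample program. The `calls` table is written by hand to match the program after the compiler's puts and fwrite substitutions, and the function `call_graph` is a hypothetical helper; compiler-inserted startup code is ignored.

```python
# caller -> callees, hand-written to mirror the sample program.
calls = {
    "main": ["depth_1"],
    "depth_1": ["depth_2_1", "depth_2_2", "puts"],
    "depth_2_1": ["puts"],
    "depth_2_2": ["fwrite"],
}
library = {"puts", "fwrite"}

def call_graph(root, calls, library):
    """Return the set of (caller, callee) edges reachable from root."""
    edges = set()
    visited = set()

    def descend(func):
        if func in visited or func in library:
            return  # do not descend into library code or revisit a node
        visited.add(func)
        for callee in calls.get(func, []):
            edges.add((func, callee))
            descend(callee)

    descend(root)
    return edges

for caller, callee in sorted(call_graph("main", calls, library)):
    print(f"{caller} -> {callee}")
```

Library functions still appear as leaf nodes, but the walk stops there, which is the behavior that keeps graphs of dynamically linked binaries small.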

Given the straightforward nature of the previous program listing, why does the graph appear to be twice as crowded as it should be? The answer is that the compiler, as virtually all compilers do, has inserted wrapper code responsible for library initialization and termination as well as for configuring parameters properly prior to transferring control to the main function. Attempting to graph a statically linked version of the same program results in the nasty mess shown in Figure 9-8.

The graph in Figure 9-8 demonstrates a behavior of external graphs in general, namely that they are always scaled initially to display the entire graph, which can result in very cluttered displays. For this particular graph, the status bar at the bottom of the WinGraph32 window indicates that there are 946 nodes and 10,125 edges that happen to cross over one another in 100,182 locations. Other than demonstrating the complexity of statically linked binaries, this graph is all but unusable. No amount of zooming and panning will simplify the graph, and beyond that, there is no way to easily locate a specific function such as main other than by reading the label on each node. By the time you have zoomed in enough to be able to read the labels associated with each node, only a few dozen nodes will fit within the display.

Two types of cross-reference graphs can be generated for global symbols (functions or global variables): cross-references to a symbol (View > Graphs > Xrefs To) and cross-references from a symbol (View > Graphs > Xrefs From). To generate an Xrefs To graph, a recursive ascent is performed by backtracking all cross-references to the selected symbol until a symbol to which no other symbols refer is reached. When analyzing a binary, you can use an Xrefs To graph to answer the question, "What sequence of calls must be made to reach this function?" Figure 9-9 shows the use of an Xrefs To graph to display the paths that can be followed to reach the puts function.
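The recursive ascent behind an Xrefs To graph can be sketched as a walk over a callee-to-callers table. In the Python below, the `callers` table is hand-written from the sample program (startup code that calls main is left out), and the generator `paths_to` is a hypothetical helper that stops when it reaches a symbol with no remaining callers.

```python
# callee -> callers, hand-written from the sample program.
callers = {
    "puts": ["depth_1", "depth_2_1"],
    "depth_2_1": ["depth_1"],
    "depth_1": ["main"],
    "main": [],
}

def paths_to(symbol, callers, suffix=()):
    """Yield call chains (outermost caller first) that end at symbol."""
    chain = (symbol,) + suffix
    # Skip callers already in the chain so recursion cannot loop forever.
    parents = [p for p in callers.get(symbol, []) if p not in chain]
    if not parents:
        yield chain
        return
    for parent in parents:
        yield from paths_to(parent, callers, chain)

for path in paths_to("puts", callers):
    print(" -> ".join(path))
```

Each printed chain is one answer to the question of which sequence of calls reaches puts.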

Similarly, Xrefs To graphs can assist you in visualizing all of the locations that reference a global variable and the chain of function calls required to reach those locations. Cross-reference graphs are the only graphs capable of incorporating data cross-reference information.

In order to create an Xrefs From graph, a recursive descent is performed by following cross-references from the selected symbol. If the symbol is a function name, only call references from the function are followed, so data references to global variables do not show up in the graph. If the symbol is an initialized global pointer variable (meaning that it actually points to something), then the corresponding data offset cross-reference is followed. When you graph cross-references from a function, the effective behavior is a function call graph rooted at the selected function, as shown in Figure 9-10.

Unfortunately, the same cluttered graph problems exist when graphing functions with a complex call graph.

Custom Cross-Reference Graphs

Custom cross-reference graphs, called user xref charts in IDA, provide the maximum flexibility in generating cross-reference graphs to suit your needs. In addition to combining cross-references to a symbol and cross-references from a symbol into a single graph, custom cross-reference graphs allow you to specify a maximum recursion depth and the types of symbols that should be included or excluded from the resulting graph.

View > Graphs > User Xrefs Chart opens the graph customization dialog shown in Figure 9-11. Each global symbol that occurs within the specified address range appears as a node within the resulting graph, which is constructed according to the options specified in the dialog. In the most common case, generating cross-references from a single symbol, the start and end addresses are identical. If the start and end addresses differ, then the resulting graph is generated for all nonlocal symbols that occur within the specified range. In the extreme case where the start address is the lowest address in the database and the end address is the highest address in the database, the resulting graph degenerates to the function call graph for the entire binary.

The options that are selected in Figure 9-11 represent the default options for all custom cross-reference graphs. Following is a description of the purpose of each set of options:

Starting direction

Options allow you to decide whether to search for cross-references from the selected symbol, to the selected symbol, or both. If all other options are left at their default settings, restricting the starting direction to Cross references to results in an Xrefs To-style graph, while restricting direction to Cross references from generates an Xrefs From-style graph.

Parameters

The Recursive option enables recursive descent (Xrefs From) or ascent (Xrefs To) from the selected symbols. Follow only current direction forces any recursion to occur in only one direction. In other words, if this option is selected, and node B is discovered to be reachable from node A, the recursive descent into B adds additional nodes that can be reached only from node B. Newly discovered nodes that refer to node B will not be added to the graph. If you choose to deselect Follow only current direction, then when both starting directions are selected, each new node added to the graph is recursed in both the to and from directions.
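The effect of Follow only current direction is easier to see on a small example. The sketch below is a simplified model with both starting directions selected; the `EDGES` table and the node names are hypothetical, and the traversal is an assumption about the behavior described here rather than IDA's actual algorithm.

```python
# Hypothetical call edges as (caller, callee) pairs.
EDGES = [("main", "A"), ("A", "B"), ("helper", "B"), ("B", "C")]


def successors(node):
    return [dst for src, dst in EDGES if src == node]


def predecessors(node):
    return [src for src, dst in EDGES if dst == node]


def user_xref_nodes(start, follow_only_current_direction=True):
    """Collect the nodes of a chart built with both starting directions."""
    seen = set()  # (node, direction) pairs that have been expanded
    # Each work item is (node, directions in which it will be expanded).
    work = [(start, ("from", "to"))]
    while work:
        node, directions = work.pop()
        for direction in directions:
            if (node, direction) in seen:
                continue
            seen.add((node, direction))
            step = successors if direction == "from" else predecessors
            for neighbor in step(node):
                if follow_only_current_direction:
                    # Keep going the same way only.
                    work.append((neighbor, (direction,)))
                else:
                    # Recurse into the new node in both directions.
                    work.append((neighbor, ("from", "to")))
    return {node for node, _ in seen}
```

Starting at `A` with the option selected, `B` is reached in the from direction, so `helper` (which refers to `B`) is left out. With the option deselected, `B` is also expanded in the to direction and `helper` joins the graph.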

Recursion depth

This option sets the maximum recursion depth and is useful for limiting the size of generated graphs. A setting of -1 causes recursion to proceed as deep as possible and generates the largest possible graphs.
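A depth limit of this kind can be modeled as a breadth-first walk that stops expanding nodes once the limit is reached. In the sketch below, the `CALLS` table and all names other than `depth_1` are hypothetical, and depth is assumed to count edge hops from the start symbol; IDA's exact counting convention may differ.

```python
# Hypothetical call edges: caller -> list of callees.
CALLS = {
    "depth_1": ["depth_2"],
    "depth_2": ["depth_3"],
    "depth_3": ["depth_4"],
    "depth_4": [],
}


def reachable(start, max_depth=-1, calls=CALLS):
    """Return {symbol: hop count} for symbols within max_depth hops.

    A max_depth of -1 means no limit, matching the dialog's convention.
    """
    depths = {start: 0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for node in frontier:
            if max_depth != -1 and depths[node] >= max_depth:
                continue  # limit reached; do not expand this node
            for callee in calls.get(node, []):
                if callee not in depths:
                    depths[callee] = depths[node] + 1
                    next_frontier.append(callee)
        frontier = next_frontier
    return depths
```

Calling `reachable("depth_1", 1)` keeps only `depth_1` and its direct callee, while `reachable("depth_1")` walks the whole chain.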

Ignore

These options dictate what types of nodes will be excluded from the generated graph. This is another means of restricting the size of the resulting graph. In particular, ignoring cross-references from library functions can lead to drastic simplifications of graphs in statically linked binaries. The trick is to make sure that IDA recognizes as many library functions as possible. Library code recognition is the subject of Chapter 12.
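To illustrate why ignoring references from library functions shrinks a graph so much, the following sketch compares edge counts with and without the filter. The `CALLS` table and the `LIBRARY` set are hypothetical, and the filter is assumed to keep a library node in the graph while not following its outgoing references; this is a model of the idea, not of IDA's exact behavior.

```python
# Hypothetical call edges and a hypothetical set of functions that have
# been recognized as library code in a statically linked binary.
CALLS = {
    "main": ["parse_args", "printf"],
    "parse_args": ["strtol"],
    "printf": ["vfprintf"],
    "vfprintf": ["write"],
    "strtol": [],
    "write": [],
}
LIBRARY = {"printf", "vfprintf", "strtol", "write"}


def call_edges(root, ignore_from_library=False):
    """Return the sorted list of call edges reachable from root."""
    edges = []
    seen = {root}
    stack = [root]
    while stack:
        func = stack.pop()
        if ignore_from_library and func in LIBRARY:
            continue  # keep the node, but do not follow its outgoing xrefs
        for callee in CALLS.get(func, []):
            edges.append((func, callee))
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return sorted(edges)
```

Even in this tiny example the filter removes the internal `printf` call chain; in a real statically linked binary the library internals usually dominate the graph, so the savings are far larger.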

Print options

These options control two aspects of graph formatting. Print comments causes any function comments to be included in a function's graph node. If Print recursion dots is selected and recursion would continue beyond the specified recursion limit, a node containing an ellipsis is displayed to indicate that further recursion is possible.

Figure 9-12 shows a custom cross-reference graph generated for function depth_1 in our example program using default options and a recursion depth of 1.

User-generated cross-reference graphs are the most powerful external-mode graphing capability available in IDA. External flowcharts have largely been superseded by IDA's integrated graph-based disassembly view, and the remaining external graph types are simply canned versions of user-generated cross-reference graphs.

IDA's Integrated Graph View

With version 5.0, IDA introduced a long-awaited interactive, graph-based disassembly view that was tightly integrated into IDA. As mentioned previously, the integrated graphing mode provides an alternative interface to the standard text-style disassembly listing. While in graph mode, disassembled functions are displayed as control flow graphs similar to external-style flowchart graphs. Because a function-oriented control flow graph is used, only one function at a time can be displayed while in graph mode, and graph mode cannot be used for instructions that lie outside any function. For cases in which you wish to view several functions at once, or when you need to view instructions that are not part of a function, you must revert to the text-oriented disassembly listing.

We detailed basic manipulation of the graph view in Chapter 5, but we reiterate a few points here. Switching between text view and graph view is accomplished by pressing the spacebar or right-clicking anywhere in the disassembly window and selecting either Text View or Graph View as appropriate. The easiest way to pan around the graph is to click the background of the graph view and drag the graph in the appropriate direction. For large graphs, you may find it easier to pan using the Graph Overview window instead. The Graph Overview window always displays a dashed rectangle around the portion of the graph currently being displayed in the disassembly window. At any time, you can click and drag the dashed rectangle to reposition the graph display. Because the Graph Overview window displays a miniature version of the entire graph, using it for panning eliminates the need to constantly release the mouse button and reposition the mouse as required when panning across large graphs in the disassembly window.

There are no significant differences between manipulating a disassembly in graph mode and manipulating a disassembly in text mode. Double-click navigation continues to work as you would expect it to, as does the navigation history list. Any time you navigate to a location that does not lie within a function (such as a global variable), the display will automatically switch to text mode. Graph mode will automatically be restored once you navigate back to a function. Access to stack variables is identical to that of text mode, with the summary stack view being displayed in the root basic block of the displayed function. Detailed stack frame views are accessed by double-clicking any stack variable, just as in text mode. All options for formatting instruction operands in text mode remain available and are accessed in the same manner in graph mode.

The primary user interface change related to graph mode deals with the handling of individual graph nodes. Figure 9-13 shows a simple graph node and its related title bar button controls.

From left to right, the three buttons on the node's title bar allow you to change the background color of the node, assign or change the name of the node, and access the list of cross-references to the node. Coloring nodes is a useful way to remind yourself that you have already analyzed a node or to simply make it stand out from others, perhaps because it contains code of particular interest. Once you assign a node a color, the color is also used as the background color for the corresponding instructions in text mode. To easily remove any coloring, right-click the node's title bar and select Set node color to default.

The middle button on the title bar in Figure 9-13 is used to assign a name to the address of the first instruction of the node's basic block. Since basic blocks are often the target of jump instructions, many nodes may already have a dummy name assigned as the result of being targeted by a jump cross-reference. However, it is possible for a basic block to begin without having a name assigned. Consider the following lines of code:

  .text:00401041 ➊   jg      short loc_401053
  .text:00401043 ➋   mov     ecx, [ebp+arg_0]

The instruction at ➊ has two potential successors, loc_401053 and the instruction at ➋. Because it has two successors, ➊ must terminate a basic block, which results in ➋ becoming the first instruction in a new basic block, even though it is not targeted explicitly by a jump and thus has no dummy name assigned.
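The rule at work here is the usual basic block "leader" rule: a block starts at the first instruction, at every branch target, and at the instruction that follows a branch. A minimal sketch is shown below. Only the jg at 0x401041 and the mov at 0x401043 come from the listing above; the other instructions and the `block_leaders` helper are hypothetical, and this is not how IDA itself represents instructions.

```python
# A hypothetical instruction list: (address, mnemonic, branch target or None).
# Only the jg at 0x401041 and the mov at 0x401043 come from the listing;
# the remaining entries are invented to make the example complete.
INSTRUCTIONS = [
    (0x40103D, "cmp", None),
    (0x401041, "jg", 0x401053),
    (0x401043, "mov", None),
    (0x401046, "add", None),
    (0x401053, "xor", None),
    (0x401055, "retn", None),
]


def block_leaders(instructions):
    """Return the sorted addresses that begin a basic block."""
    addresses = [addr for addr, _, _ in instructions]
    leaders = {addresses[0]}  # the first instruction always starts a block
    for index, (addr, mnemonic, target) in enumerate(instructions):
        if target is not None:
            leaders.add(target)  # the branch target starts a block
            if index + 1 < len(instructions):
                # The fall-through successor also starts a block, even
                # though no jump targets it and it has no dummy name.
                leaders.add(addresses[index + 1])
    return sorted(leaders)
```

In this model 0x401043 is reported as a leader purely because it follows the conditional jump, which is exactly the unnamed-block case the text describes.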

The rightmost button in Figure 9-13 is used to access the list of cross-references that target the node. Since cross-reference comments are not displayed by default in graph mode, this is the easiest way to access and navigate to any location that references the node. Unlike the cross-reference lists we have discussed previously, the generated node cross-reference list also contains an entry for the ordinary flow into the node (designated by type ^). This is required because it is not always obvious in graph view which node is the linear predecessor of a given node. If you wish to view normal cross-reference comments in graph mode, access the Cross-References tab under Options > General and set the Number of displayed xrefs option to something other than zero.

Nodes within a graph may be grouped either by themselves or with other nodes in order to reduce some of the clutter in a graph. To group multiple nodes, CTRL-click the title bar of each node to be grouped and then right-click the title bar of any selected node and select Group nodes. You will be prompted to enter some text (defaults to the first instruction in the group) to be displayed in the collapsed node. Figure 9-14 shows the result of grouping the node in Figure 9-13 and changing the node text to collapsed node demo.

Note that two additional buttons are now present in the title bar. In left-to-right order, these buttons allow you to uncollapse (expand) the grouped node and edit the node text. Uncollapsing a node merely expands the nodes within a group to their original form; it does not change the fact that the node or nodes now belong to a group. When a group is uncollapsed, the two new buttons just mentioned are removed and replaced with a single Collapse Group button. An expanded group can easily be collapsed again using the Collapse Group button or by right-clicking the title bar of any node in the group and selecting Hide Group. To completely remove a grouping applied to one or more nodes, you must right-click the title bar of the collapsed node or one of the participating uncollapsed nodes and select Ungroup Nodes. This action has the side effect of expanding the group if it was collapsed at the time.