Chapter 14 – Additional IM Clients


Pidgin

The IM client Pidgin is compatible with multiple chat protocols, including AIM, ICQ, Google Talk, Jabber/XMPP, MSN Messenger, Yahoo!, Bonjour, Gadu-Gadu, IRC, Novell GroupWise Messenger, Lotus Sametime, SILC, SIMPLE, MXit, MySpaceIM, and Zephyr. Pidgin was formerly named GAIM (GTK+ AOL Instant Messenger). Pidgin uses the open source libpurple library to interact with the various chat protocols. Features include chat sessions, group chats, and file transfers. Support for file transfers and group chats varies by protocol, however. For example, file transfers are not enabled in AIM chat sessions.

Log Storage

Pidgin logs are stored in HTML files on the local host. Each protocol used has its own directory, and within each protocol directory there is a separate subfolder named after each of the user’s account names. Within each account folder are additional folders, one for each contact with whom there was a conversation. The log file is named using the convention YYYY-MM-DD.HHMMSSTZ.html, where TZ is the local time zone expressed as an offset plus or minus from UTC, followed by the common three-letter abbreviation (for example, 2013-06-20.154512-0400EDT.html). The locations of the logs are listed by operating system in the following table:

Operating System    Path
Windows Vista/7     C:\Users\{Windows_profile}\AppData\Roaming\.purple\logs\{protocol}\{account_profile_name}\{contact_profile_name}
Windows 2000/XP     C:\Documents and Settings\{Windows_profile}\Application Data\.purple\logs\{protocol}\{account_profile_name}\{contact_profile_name}
Linux               /home/{Linux_profile}/.purple/logs/{protocol}/{account_profile_name}/{contact_profile_name}
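Because the file name alone records when a conversation started, it can be worth decoding before you open the log. The following is a minimal Python sketch, assuming the YYYY-MM-DD.HHMMSSTZ.html convention described above; the function name parse_pidgin_log_name is our own and is not part of Pidgin.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern for the YYYY-MM-DD.HHMMSSTZ.html convention,
# e.g. 2013-06-20.154512-0400EDT.html
LOG_NAME = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\.(?P<time>\d{6})"
    r"(?P<sign>[+-])(?P<oh>\d{2})(?P<om>\d{2})(?P<abbr>[A-Za-z]*)\.html$"
)

def parse_pidgin_log_name(name):
    """Return (timezone-aware local datetime, zone abbreviation) or None."""
    m = LOG_NAME.match(name)
    if m is None:
        return None
    offset = timedelta(hours=int(m["oh"]), minutes=int(m["om"]))
    if m["sign"] == "-":
        offset = -offset
    naive = datetime.strptime(m["date"] + m["time"], "%Y-%m-%d%H%M%S")
    return naive.replace(tzinfo=timezone(offset)), m["abbr"]

started, abbr = parse_pidgin_log_name("2013-06-20.154512-0400EDT.html")
print(started.isoformat(), abbr)
# Normalizing to UTC makes logs from different machines comparable.
print(started.astimezone(timezone.utc).isoformat())
```

Converting to UTC at this stage is a design choice rather than a requirement; it simply makes it easier to line up Pidgin conversations with other time-stamped evidence.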

Log Format

The format of the log is HTML. Each entry will begin with a “font color” tag, size, and a timestamp in local machine time. This is followed by the nickname of the person sending the message, then the message itself in the body field. Here’s an example:

<font color="#A82F2F"><font size="2">(11:07:05 AM)</font>
 <b>{Nickname}:</b></font> Hey there.<br/>
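If you need to pull many entries out of a log, a regular expression built around this layout can do the job. The Python sketch below assumes every entry follows the exact tag order shown in the example above; entries that deviate (system messages, for instance) are simply skipped. The names ENTRY and parse_entries are our own.

```python
import re
from html import unescape

# One log entry: color tag, size tag, (timestamp), bold nickname, body, <br/>
ENTRY = re.compile(
    r'<font color="(?P<color>#[0-9A-Fa-f]{6})"><font size="\d+">'
    r'\((?P<time>[^)]+)\)</font>\s*<b>(?P<nick>.*?):</b></font>\s*'
    r'(?P<body>.*?)<br\s*/?>',
    re.DOTALL,
)

def parse_entries(html_text):
    """Return a list of (time, nickname, message) tuples."""
    entries = []
    for m in ENTRY.finditer(html_text):
        body = re.sub(r"<[^>]+>", "", m["body"])  # drop inline markup
        entries.append((m["time"], unescape(m["nick"]), unescape(body).strip()))
    return entries

sample = (
    '<font color="#A82F2F"><font size="2">(11:07:05 AM)</font>\n'
    ' <b>{Nickname}:</b></font> Hey there.<br/>\n'
)
print(parse_entries(sample))
```

Note that the entry carries only a time of day; the date has to come from the log file name.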


Additional artifacts that can be found in Pidgin include account data, a buddy list, and the custom icons for the user’s chat contacts. These can be found in the “.purple” directory within the user’s profile directory:

  • accounts.xml This XML document contains information relevant to each individual protocol account used with Pidgin, such as AIM, Yahoo!, or Facebook. If the user stored a password with the client, the password appears in plain text inside the XML tag <password>. In the account we examined in the Notepad++ text editor, the password was actually “plaintext.”

  • blist.xml The blist XML document includes information about the user’s stored buddy data. This includes the buddy’s screen name, the last time the buddy was seen online, and the name of the buddy’s icon.
  • icons\ The icons folder contains the buddy icons for the user’s chat partners. The blist XML document maps each buddy’s name to the corresponding icon picture.
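Stored passwords can be pulled out of accounts.xml with any XML parser. The following Python sketch is illustrative only: the sample document is hypothetical, and the element layout (an outer account element wrapping one account element per configured account, each with protocol, name, and password children) is an assumption you should confirm against the file in your case.

```python
import xml.etree.ElementTree as ET

# Hypothetical accounts.xml content; the element layout is an assumption.
SAMPLE = """<account version='1.0'>
  <account>
    <protocol>prpl-aim</protocol>
    <name>example_user</name>
    <password>plaintext</password>
  </account>
  <account>
    <protocol>prpl-jabber</protocol>
    <name>example_user@example.org</name>
  </account>
</account>
"""

def stored_passwords(xml_text):
    """Return (protocol, account name, password) for accounts with a saved password."""
    root = ET.fromstring(xml_text)
    found = []
    for acct in root.iter("account"):
        name = acct.findtext("name")
        password = acct.findtext("password")
        # The outer wrapper has no <name> child, so it is skipped here.
        if name is not None and password is not None:
            found.append((acct.findtext("protocol"), name, password))
    return found

print(stored_passwords(SAMPLE))
```

Accounts without a password element are left out of the result, which matches the case where the user chose not to save the password.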


Pidgin preferences are stored as an XML document named prefs.xml in the .purple directory within the user’s profile path. The preferences include the logging options. Here’s an example:

      <pref name='logging'>
             <pref name='log_ims' type='bool' value='1'/>
             <pref name='log_chats' type='bool' value='1'/>
             <pref name='log_system' type='bool' value='0'/>
             <pref name='format' type='string' value='html'/>
      </pref>

For the Boolean preferences, a value of ‘1’ indicates true, whereas ‘0’ indicates false. In the example, IMs and group chats were logged, system events were not, and the log format (a string value) was HTML.
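To report the logging configuration quickly, you can read the logging preferences programmatically. This Python sketch assumes the pref element layout shown in the example above; logging_prefs is a name we chose, and the code searches for the logging element wherever it is nested, because its position in a complete prefs.xml may differ from this excerpt.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<pref name='logging'>
  <pref name='log_ims' type='bool' value='1'/>
  <pref name='log_chats' type='bool' value='1'/>
  <pref name='log_system' type='bool' value='0'/>
  <pref name='format' type='string' value='html'/>
</pref>
"""

def logging_prefs(xml_text):
    """Return the logging preferences as a dict, converting bools."""
    root = ET.fromstring(xml_text.strip())
    if root.get("name") == "logging":
        node = root
    else:
        node = root.find(".//pref[@name='logging']")
    if node is None:
        return {}
    prefs = {}
    for child in node.findall("pref"):
        value = child.get("value")
        if child.get("type") == "bool":
            value = (value == "1")  # '1' is true, '0' is false
        prefs[child.get("name")] = value
    return prefs

print(logging_prefs(SAMPLE))
```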


Web browsers and text editors are the most common tools for viewing Pidgin logs. You can also use a regular expression search to search memory, page files, and unallocated space for the nicknames of the chat subjects. If the nicknames are unknown, searching for the font color tags can also be useful. For example, you can search for

<font color="#
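A search like this is easy to script against a raw image or memory dump. The Python sketch below is one possible approach: it scans a byte buffer for the font color tag followed by the size tag and timestamp seen in the earlier log example and reports the offset of each hit. The pattern TAG and the sample buffer are our own illustrations.

```python
import re

# Font color tag, size tag, then a short parenthesized timestamp.
TAG = re.compile(
    rb'<font color="#[0-9A-Fa-f]{6}"><font size="\d+">\(([^)]{1,40})\)</font>'
)

def find_candidates(data):
    """Return (byte offset, timestamp text) for each likely Pidgin entry."""
    return [
        (m.start(), m.group(1).decode("ascii", "replace"))
        for m in TAG.finditer(data)
    ]

# Hypothetical buffer: binary noise surrounding one log fragment.
blob = (
    b"\x00\x13junk"
    + b'<font color="#A82F2F"><font size="2">(11:07:05 AM)</font> <b>x:</b>'
    + b"\xff\x00"
)
print(find_candidates(blob))
```

Requiring the size tag and timestamp after the color tag cuts down on hits from unrelated HTML; searching for the bare color tag casts a wider net at the cost of more false positives.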


mIRC

The most common chat client for IRC (Internet Relay Chat) is mIRC. It was developed for Microsoft Windows in 1995 and has been popular ever since. The client provides a GUI for accessing chat channels and private messages, both of which are plain text (with optional color). IRC is still used today by technically savvy users, criminal groups, and the nostalgic. The client’s features are designed to help a user save preferred channels and groups and to set up hotkey access for plain text commands.

Log Storage

Logs for mIRC are stored in plain text format. The client enables logging of chat groups and private messages by default. mIRC names each log file after the channel or nickname, followed by the network name and the .log extension.


For example, #testchannel.UnderNet.log would be the log file for the channel #testchannel on the UnderNet IRC network. Log locations are listed in the following table for specific operating systems:

Operating System    Path
Windows Vista/7     C:\Users\{Windows_profile}\AppData\Roaming\mIRC\logs
Windows 2000/XP     C:\Documents and Settings\{Windows_profile}\Application Data\mIRC\logs
Linux               /home/{Linux_profile}/mIRC/logs

Log Format

The log file name identifies the individual chat session. A hash (#) at the beginning of a log file name indicates a chat room; otherwise, the log contains the contents of a private message session. The first field is the channel name or the user name of the other party in the chat. The next field is the server name. The last field is the “.log” extension. Here’s an example:

  • #chat.EFnet.log
    Chat room: chat
    Server: EFnet

  • katkat.EFnet.log
    Private message with: katkat
    Server: EFnet
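When a logs folder holds hundreds of files, it helps to sort them into chat rooms and private sessions automatically. This is a small Python sketch of the naming rules just described; parse_mirc_log_name is a hypothetical helper, and it assumes the server name itself contains no dot.

```python
def parse_mirc_log_name(file_name):
    """Split a mIRC log file name into its type, name, and server."""
    if not file_name.lower().endswith(".log"):
        return None
    stem = file_name[:-len(".log")]
    # Assumption: the server name has no dot, so the last dot in the
    # stem separates the channel or nickname from the server.
    target, sep, network = stem.rpartition(".")
    if not sep or not target or not network:
        return None
    is_channel = target.startswith("#")
    return {
        "kind": "chat room" if is_channel else "private message",
        "name": target[1:] if is_channel else target,
        "server": network,
    }

for name in ("#chat.EFnet.log", "katkat.EFnet.log"):
    print(name, "->", parse_mirc_log_name(name))
```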

Each message will begin with a timestamp (in local machine time). The user name of the message sender will be bracketed by less-than (<) and greater-than (>) signs. If the user created a “pose,” the action would be preceded by an asterisk (*). If a bot performed the logged action, the bot name will be bracketed by dashes (-).
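The line conventions above lend themselves to a simple classifier. The Python sketch below is a rough illustration, not a complete mIRC parser: it assumes the timestamp is enclosed in square brackets in HH:MM or HH:MM:SS form, and the sample lines are hypothetical. Be aware that some status lines also begin with an asterisk, so anything labeled as a pose deserves a manual look.

```python
import re

# Assumed layout: [HH:MM] or [HH:MM:SS], then the rest of the line.
LINE = re.compile(r"^\[(?P<ts>\d{1,2}:\d{2}(?::\d{2})?)\]\s+(?P<rest>.*)$")
SPOKEN = re.compile(r"^<(?P<nick>[^>]+)>\s?(?P<text>.*)$")   # <nick> text
BOT = re.compile(r"^-(?P<nick>\S+?)-\s+(?P<text>.*)$")       # -nick- text
POSE = re.compile(r"^\*\s+(?P<nick>\S+)\s?(?P<text>.*)$")    # * nick text

def classify(line):
    """Return a dict describing one log line, or None if it has no timestamp."""
    m = LINE.match(line.rstrip("\r\n"))
    if not m:
        return None
    rest = m["rest"]
    for kind, pattern in (("message", SPOKEN), ("bot", BOT), ("pose", POSE)):
        hit = pattern.match(rest)
        if hit:
            return {"time": m["ts"], "kind": kind,
                    "nick": hit["nick"], "text": hit["text"]}
    return {"time": m["ts"], "kind": "other", "nick": None, "text": rest}

samples = [
    "[11:07] <katkat> hello there",
    "[11:08] * katkat waves",
    "[11:09] -ChanServ- Welcome to #chat",
]
for s in samples:
    print(classify(s))
```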


mIRC allows for file transfers in addition to chat sessions. By default, the files will be saved in the /mIRC/downloads folder. Custom sounds, scripts, and channel history can be found in the /mIRC/ folder as well. The channel list displays a list of chat sessions available at the time the user logged in to the server. Scripts and sounds are customized events that a user can populate.


Preferences and user settings, including the user name, are stored in the file /mIRC/mirc.ini. Each group of settings is introduced by a label enclosed in square brackets. Here’s an example:


This indicates that the last active user ID was “Hammer” and that the [ident] request was made over port 113. The “system” field must always be populated with “UNIX” for compatibility with IRC servers, which use the Unix “ident” service to identify users. The “mirc” label is also helpful in identifying the nickname the user used in the chat, which may or may not be different from the user ID. An example is shown here:


Note that many of the fields do not require a value.
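Because mirc.ini uses the familiar bracketed-label layout, a standard INI parser can usually read it. The Python sketch below uses a hypothetical excerpt: only the values “Hammer,” “UNIX,” and port 113 come from the discussion above, while the key names and the nickname value are assumptions made for illustration. A real mirc.ini may contain lines that need extra handling.

```python
import configparser

# Hypothetical mirc.ini excerpt. Key names and the nick value are
# assumptions; "Hammer", "UNIX", and 113 come from the text.
SAMPLE = """\
[ident]
active=yes
userid=Hammer
system=UNIX
port=113

[mirc]
user=
email=
nick=Hammer_
anick=
host=
"""

def read_settings(ini_text):
    """Return {section: {key: value}} for every labeled section."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(ini_text)
    return {s: dict(parser.items(s)) for s in parser.sections()}

settings = read_settings(SAMPLE)
print(settings["ident"]["userid"], settings["ident"]["port"])
print(settings["mirc"]["nick"])
```

Empty fields come back as empty strings, which is consistent with the observation that many fields do not require a value.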


You can use plain text viewers such as Notepad and Notepad++ to review the logs saved by mIRC. Standard regular expressions and carving tools can help you identify chat fragments in memory, page files, or unallocated space on a device. Techniques for locating chats include searching for a user name or timestamps in brackets.

Google Talk

Google Talk is an application built into the Gmail web page. Although a stand-alone application does exist, we have never encountered it on an investigation, and Google is phasing it out. Google Talk uses the Extensible Messaging and Presence Protocol (XMPP). XMPP was originally named Jabber and is sometimes still referred to by that name. Separate IM applications that support XMPP, such as Pidgin, Trillian, and several mobile phone applications, therefore also support Google Talk. Examining an independent IM application is specific to the client: logs, preferences, formats, and so on will follow the design of the client, not the protocol. Messages are logged by default on the Gmail server under the All Mail label for the user’s profile. You will have to follow the proper legal procedure to obtain the messages.

As we’ve mentioned before, applications and protocols change over time. Not surprisingly, in 2013, Google began transitioning Google Talk to its new Google Hangouts service. Ultimately, this means that Google Talk, including Google’s use of the open XMPP standard, will go away. Because the Hangouts service is brand new, we will not cover it in this version of the book. You can learn more about Google Talk and the new Hangouts service on Google’s developer website.

Log Storage

Google Talk logs are saved on the Google server by default under the “All Mail” mail label. The recipient or sender can also save messages manually via cut and paste or “print to document” operations.

Log Format

Chat messages in Google Talk are in a proprietary format. Each message is wrapped in a bracketed structure in which the fields are delimited by commas and the text fields are wrapped in quotes. Known fields include a message ID, an unknown field (observed as “c” when we did our analysis), a Unix millisecond timestamp in UTC, the sender’s identification, the message in text, and the message with encoded Unicode and special characters. Here are some examples:

["135ECCAxxx779D79_10","c",1368377512985,0,"","I don't know. I don't see it here.","I don\u0026#39;t know. I don\u0026#39;t see it here.","",[] ,1]

Message ID: 135ECCAxxx779D79_10
Type: “c” (unknown field)
Timestamp: 1368377512985 (12 May, 2013 16:51:52 UTC)
Message body in text: “I don’t know. I don’t see it here.”
Message body in PHP/HTML: “I don’t know. I don’t see it here.”

NOTE: You may notice the original message has an odd string: don\u0026#39;t. This string contains encoding due to special characters that were present in the original text. Those special characters must be encoded for any application that displays HTML. In this case, \u0026 is a Unicode encoding for the ampersand character (&). Replacing that, you are left with don&#39;t. The string “&#39;” is standard HTML-entity encoding, and in this case converts to an apostrophe. The final decoded text is “don’t.”

["purplexxx568cb","c",1368377485541,0,"","messagereturnedtojeff7","messagereturned\u003cwbr\u003etojeff7","",[] ,1]

Message ID: purplexxx568cb
Type: “c” (unknown field)
Timestamp: 1368377485541 (12 May, 2013 16:51:25 UTC)
Sender: “”
Message body in text: “messagereturnedtojeff7”
Message body in PHP/HTML: “messagereturnedtojeff7”

In this message, note that the string \u003cwbr\u003e decodes to <wbr>, which is the HTML word break opportunity element.
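The decoding steps walked through above can be reproduced in a few lines of Python. This sketch relies on the observation that the sample records happen to parse as JSON arrays; that is our assumption about the observed samples, not a documented property of the format. The field positions, including treating the fifth field as the sender, are likewise inferred from the examples.

```python
import html
import json
from datetime import datetime, timedelta, timezone

RAW = r'''["135ECCAxxx779D79_10","c",1368377512985,0,"","I don't know. I don't see it here.","I don\u0026#39;t know. I don\u0026#39;t see it here.","",[] ,1]'''

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def decode_record(raw):
    """Decode one bracketed chat record; field positions are assumptions."""
    fields = json.loads(raw)  # also turns \u0026 into &
    return {
        "message_id": fields[0],
        "type": fields[1],
        # Millisecond Unix timestamp, converted with integer arithmetic.
        "timestamp_utc": EPOCH + timedelta(milliseconds=fields[2]),
        "sender": fields[4],
        "text": fields[5],
        # Second step: resolve HTML entities such as &#39;
        "html_decoded": html.unescape(fields[6]),
    }

record = decode_record(RAW)
print(record["timestamp_utc"].strftime("%d %B %Y %H:%M:%S UTC"))
print(record["html_decoded"])
```

Decoding happens in two passes, mirroring the note above: the JSON parser resolves the \u escapes, and html.unescape then resolves the HTML entities.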


Locating artifacts on disk from Google Talk will be challenging because the application is web based and stores little locally. A search for “” may reveal the accounts that were used on the system to send Google Talk messages. A memory image may contain recently opened or sent emails and conversations. To locate messages, you can search for the sender’s user name, which may turn up chat fragments in memory. Additionally, you can attempt to search for this:


Because this string is short and not very unique, this search may result in many false positives.


Google Talk preferences are stored on the server. They may be obtained by serving proper legal process on Google.


There were no publicly available tools for searching or parsing Google Talk chats or fragments when we did our research for this chapter. We used a combination of hex editors and regular expression searches to locate Google Talk chat artifacts.
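As an illustration of the regular expression approach, the Python sketch below scans a byte buffer for the opening of a record that matches the samples shown earlier: a quoted ID, a short quoted type field, and a 13-digit millisecond timestamp. The pattern is our own guess based on those two samples and will miss records that deviate from them; the buffer is hypothetical.

```python
import re

# Start of a record as seen in the samples: ["<id>","<type>",<13 digits>,
RECORD_START = re.compile(rb'\["([^"\\]{1,64})","([^"\\]{1,8})",(\d{13}),')

def find_record_starts(data):
    """Return (byte offset, message ID, millisecond timestamp) per hit."""
    return [
        (m.start(), m.group(1).decode("ascii", "replace"), int(m.group(3)))
        for m in RECORD_START.finditer(data)
    ]

# Hypothetical buffer: noise around a fragment of the second sample.
blob = (
    b"\x00\x00noise"
    + b'["purplexxx568cb","c",1368377485541,0,"","messagereturnedtojeff7"'
    + b"\x00"
)
print(find_record_starts(blob))
```

Anchoring on the 13-digit timestamp keeps the pattern far more selective than a short literal string, and each hit gives you an offset to examine in a hex editor.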