The Mac® Hacker’s Handbook

Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright 2009 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-39536-3
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1

Library of Congress Cataloging-in-Publication Data is available from the publisher.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom.
The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Mac is a registered trademark of Apple, Inc. All other trademarks are the property of their respective owners. Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Contents

  QuickTime
  .mov
  RTSP
  Conclusion
  References

Chapter 3 Attack Surface
  Searching the Server Side
  Nonstandard Listening Processes
  Cutting into the Client Side
  Safari
  All of Safari’s Children
  Safe File Types
  Having Your Cake
  Conclusion
  References

Part II Discovering Vulnerabilities

Chapter 4 Tracing and Debugging
  Pathetic ptrace
  Good Ol’ GDB
  DTrace
  D Programming Language
  Describing Probes
  Example: Using DTrace
  Example: Using ltrace
  Example: Instruction Tracer/Code-Coverage Monitor
  Example: Memory Tracer
  PyDbg
  PyDbg Basics
  Memory Searching
  In-Memory Fuzzing
  Binary Code Coverage with Pai Mei
  iTunes Hates You
  Conclusion
  References

Chapter 5 Finding Bugs
  Bug-Hunting Strategies
  Old-School Source-Code Analysis
  Getting to the Source
  Code Coverage
  CanSecWest 2008 Bug
  vi + Changelog = Leopard 0-day
  Apple’s Prerelease-Vulnerability Collection
  Fuzz Fun
  Network Fuzzing
  File Fuzzing
  Conclusion
  References

Chapter 6 Reverse Engineering
  Disassembly Oddities
  EIP-Relative Data Addressing
  Messed-Up Jump Tables
  Identifying Missed Functions
  Reversing Obj-C
  Cleaning Up Obj-C
  Shedding Light on objc_msgSend Calls
  Case Study
  Patching Binaries
  Conclusion
  References

Part III Exploitation

Chapter 7 Exploiting Stack Overflows
  Stack Basics
  Stack Usage on PowerPC
  Stack Usage on x86
  Smashing the Stack on PowerPC
  Smashing the Stack on x86
  Exploiting the x86 Nonexecutable Stack
  Return into system()
  Executing the Payload from the Heap
  Finding Useful Instruction Sequences
  PowerPC
  x86
  Conclusion
  References

Chapter 8 Exploiting Heap Overflows
  The Heap
  The Scalable Zone Allocator
  Regions
  Freeing and Allocating Memory
  Overwriting Heap Metadata
  Arbitrary 4-Byte Overwrite
  Large Arbitrary Memory Overwrite
  Obtaining Code Execution
  Taming the Heap with Feng Shui
  Fill ’Er Up
  Feng Shui
  WebKit’s JavaScript
  Case Study
  Feng Shui Example
  Heap Spray
  References

Chapter 9 Exploit Payloads
  Mac OS X Exploit Payload Development
  Restoring Privileges
  Forking a New Process
  Executing a Shell
  Encoders and Decoders
  Staged Payload Execution
  Payload Components
  PowerPC Exploit Payloads
  execve_binsh
  system
  decode_longxor
  tcp_listen
  tcp_connect
  tcp_find
  dup2_std_fds
  vfork
  Testing Simple Components
  Putting Together Simple Payloads
  Intel x86 Exploit Payloads
  remote_execution_loop
  inject_bundle
  Testing Complex Components
  Conclusion
  References

Chapter 10 Real-World Exploits
  QuickTime RTSP Content-Type Header Overflow
  Triggering the Vulnerability
  Exploitation on PowerPC
  Exploitation on x86
  mDNSResponder UPnP Location Header Overflow
  Triggering the Vulnerability
  Exploiting the Vulnerability
  Exploiting on PowerPC
  QuickTime QTJava toQTPointer() Memory Access
  Exploiting toQTPointer()
  Obtaining Code Execution
  Conclusion
  References

Part IV Post-Exploitation

Chapter 11 Injecting, Hooking, and Swizzling
  Introduction to Mach
  Mach Abstractions
  Mach Security Model
  Mach Exceptions
  Mach Injection
  Remote Threads
  Remote Process Memory
  Loading a Dynamic Library or Bundle
  Inject-Bundle Usage
  Example: iSight Photo Capture
  Function Hooking
  Example: SSLSpy
  Objective-C Method Swizzling
  Example: iChat Spy
  Conclusion
  References

Chapter 12 Rootkits
  Kernel Extensions
  Hello Kernel
  System Calls
  Hiding Files
  Hiding the Rootkit
  Maintaining Access across Reboots
  Controlling the Rootkit
  Creating the RPC Server
  Injecting Kernel RPC Servers
  Calling the Kernel RPC Server
  Remote Access
  Hardware-Virtualization Rootkits
  Hyperjacking
  Rootkit Hypervisor
  Conclusion
  References

Index

Foreword

with a Unix terminal a click away. Here was a box I could run Microsoft Office on that came with Apache by default and still held full man pages. As I delved into AppleScript, plists, DMGs, and the other minutiae of OS X, I was amazed by the capabilities of the operating system, and the breadth and depth of tools available.

But as I continued to switch completely over to Apple, especially after the release of Intel Macs, my fingers started creeping around for those cracks at the edges again. I wasn’t really worried about viruses, but, as a security professional, I started wondering if this was by luck or design. I read the Apple documentation and realized fairly early that there wasn’t a lot of good information on how OS X worked from a security standpoint, other than some configuration guides and marketing material.

Mac security attitudes have changed a fair bit since I purchased that first Mac Mini. As Macs increase in popularity, they face more scrutiny. Windows switchers come with questions and habits, more security researchers use Macs in their day-to-day work, the press is always looking to knock Apple down a notch, and the bad guys won’t fail to pounce on any profitable opportunity. But despite this growing attention, there are few resources for those who want to educate themselves and better understand the inner workings of the operating system on which they rely.

That’s why I was so excited when Dino first mentioned he and Charlie were working on this book. Ripping into the inner guts of Mac OS X and finding those edges to tear apart are the only ways to advance the security of the platform. Regular programming books and system overviews just don’t look at any operating system from the right perspective; we need to know how something breaks in order to make it stronger.
And, as any child (or hacker) will tell you, breaking something is the most exhilarating way to learn.

If you are a security professional, this book is one of the best ways to understand the strengths and weaknesses of Mac OS X. If you are a programmer, this book will not only help you write more secure code, but it will also help you in your general coding practices. If you are just a Mac enthusiast, you’ll learn how hackers look at our operating system of choice and gain a better understanding of its inner workings. Hopefully Apple developers will use this to help harden the operating system, making the book obsolete with every version. Yes, maybe a few bad guys will use it to write a few exploits, but the benefits of having this knowledge far outweigh the risks.

For us hackers, even those of us of limited skills, this book provides us with a roadmap for exploring those edges, finding those cracks, and discovering new possibilities. For me, it’s the literary equivalent of sliding that beige plastic cover off my childhood friend’s first Apple and gazing at the inner workings.

Rich Mogull
Security Editor at TidBITS and Analyst at Securosis

Introduction

How This Book Is Organized

This book is divided into four parts, roughly aligned with the steps an attacker would have to take to compromise a computer: Background, Vulnerabilities, Exploitation, and Post-Exploitation. The first part, consisting of Chapters 1–3, contains introductory material concerning Mac OS X. It points out what makes this operating system different from Linux or Windows and demonstrates the tools that will be needed for the rest of the book. The next part, consisting of Chapters 4–6, demonstrates the tools and techniques necessary to identify security vulnerabilities in the operating system and applications running on it. Chapters 7–10 make up the next part of the book.
These chapters illustrate how attackers can take the weaknesses found in the earlier chapters and turn them into functional exploits, giving them the ability to compromise vulnerable machines. Chapters 11 and 12 make up the last part of the book, which deals with what attackers may do after they have exploited a machine and techniques they can use to maintain continued access to the compromised machines.

Chapter 1 begins the book with the basics of the way Mac OS X is designed. It discusses how it originated from BSD and the changes that have been made in it since that time. Chapter 1 gives a brief introduction to many of the tools that will be needed in the rest of the book. It highlights the differences between Mac OS X and other operating systems and takes care to demonstrate how to perform common tasks that differ among the operating systems. Finally, it outlines and analyzes some of the security improvements made in the release of Leopard, the current version of Mac OS X.

Chapter 2 covers some uncommon protocols and file formats used by Mac OS X. This includes a description of how Bonjour works, as well as an inside look at the Mac OS X implementation, mDNSResponder. It also dissects the QuickTime file format and the RTSP protocol utilized by QuickTime Player.

Chapter 3 examines what portions of the operating system process attacker-supplied data, known as the attack surface. It begins by looking in some detail at what services are running by default on a typical Mac OS X computer and examines the difficulties in attacking these default services. It moves on to consider the client-side attack surface, all the code that can be executed if an attacker can get a client program such as Safari to visit a server the attacker controls, such as a malicious website.

Chapter 4 dives into the world of debugging in a Mac OS X environment. It shows how to follow along to see what applications are doing internally.
It covers in some detail the powerful DTrace mechanism that was introduced in Leopard. It also outlines the steps necessary to capture code-coverage information using the Pai Mei reverse-engineering framework.

Chapter 5 demonstrates how to find security weaknesses in Mac OS X software. It talks about how you can look for bugs in the source code Apple makes available or use a black-box technique such as fuzzing. It includes detailed instructions for performing either of these methods. Finally, it shows some tricks to take advantage of the way Apple develops its software, which can help find bugs it doesn’t know about or give early warning of those it does.

Chapter 6 discusses reverse engineering in Mac OS X. Given that most of the code in Mac OS X is available in binary form only, this chapter discusses how this software works statically. It also highlights some differences that arise in reverse engineering code written in Objective-C, which is quite common in Mac OS X binaries but rarely seen otherwise.

Chapter 7 begins the exploitation part of the book. It introduces the simplest of buffer-overflow attacks, the stack overflow. It outlines how the stack is laid out for both PowerPC and x86 architectures and how, by overflowing a stack buffer, an attacker can obtain control of the vulnerable process.

Chapter 8 addresses the heap overflow, the other common type of exploit. This entails describing the way the Mac OS X heap and memory allocations function. It shows techniques in which overwriting heap metadata allows an attacker to gain complete control of the application. It finishes by showing how to arrange the heap to overwrite other important application data to compromise the application.

Chapter 9 addresses exploit payloads. Now that you know how to get control of the process, what can you do? It demonstrates a number of different possible shellcodes and payloads for both PowerPC and x86 architectures, ranging from simple to advanced.
Chapter 10 covers real-world exploitation, demonstrating a large number of advanced exploitation topics, including many in-depth example exploits for Tiger and Leopard on both PowerPC and x86. If Chapters 7–9 were the theory of attack, then this chapter is the practical aspect of attack.

Chapter 11 covers how to inject code into running processes using Mac OS X–specific hooking techniques. It provides all the code necessary to write and test such payloads. It also includes some interesting code examples of what an attacker can do, including spying on iChat sessions and reading encrypted network traffic.

Chapter 12 addresses the topic of rootkits, or code an attacker uses to hide their presence on a compromised system. It illustrates how to write basic kernel-level drivers and moves on to examples that will hide files from unsuspecting users at the kernel level. It finishes with a discussion of Mac OS X–specific rootkit techniques, including hidden in-kernel Mach RPC servers, network kernel extensions for remote access, and VT-x hardware virtual-machine hypervisor rootkits for advanced stealth.

Who Should Read This Book

This book is written for a wide variety of readers, ranging from Mac enthusiasts to hard-core security researchers. Those readers already knowledgeable about Mac OS X but wanting to learn more about the security of the system may want to skip to Chapter 4. Conversely, security researchers may find the first few chapters the most useful, as those chapters show how to apply the skills they already possess to Mac OS X.

While the book may be easier to comprehend if you have some experience writing code or administering Mac OS X computers, no experience is necessary. It starts from the very basics and slowly works up to the more-advanced topics. The book is careful to illustrate the points it is making with many examples, and outlines exactly how to perform the steps required.
The book is unique in that, although anybody with enthusiasm for the subject can pick it up and begin reading it, by the end of the book the reader will have a world-class knowledge of the security of the Mac OS X operating system.

Tools You Will Need

For the most part, all you need to follow along with this book is a computer with Mac OS X Leopard installed. Although many of the techniques and examples will work in earlier versions of Mac OS X, they are designed for Leopard.

To perform the techniques illustrated in Chapter 6, a recent version of IDA Pro is required. This is a commercial tool that must be run in Windows and can be purchased at http://www.hex-rays.com. The remaining tools either come on supplemental disks, as Xcode does, or are freely available online or at this book’s website.

What’s on the Website

This book includes a number of code samples. The small and moderately sized examples are included directly in this book. But to save you from having to type these in yourself, all the code samples are also available for download at www.wiley.com/go/machackershandbook. Additionally, some long code samples that are omitted from the book are available on the site, as are any other tools developed for the book.

Final Note

We invite you to dive right in and begin reading. We think there is something in this book for just about everyone who loves Mac OS X. We know we learned a lot in researching and writing this book. If you have comments, questions, hate mail, or anything else, please drop us a line and we’d be happy to discuss our favorite operating system with you.

Part I ■ Mac OS X Basics

Chapter 1 ■ Mac OS X Architecture

system, networking, and I/O, to run as user-level Mach tasks. In earlier Mach-based UNIX systems, the UNIX layer ran as a server in a separate task. However, in Mac OS X, Mach and the BSD code run in the same address space.
In XNU, Mach is responsible for many of the low-level operations you expect from a kernel, such as processor scheduling, multitasking, and virtual-memory management.

BSD

The kernel also includes a large chunk of code derived from the FreeBSD code base. As mentioned earlier, this code runs as part of the kernel along with Mach and uses the same address space. The FreeBSD code within XNU may differ significantly from the original FreeBSD code, as changes had to be made for it to coexist with Mach. FreeBSD provides many of the remaining operations the kernel needs, including

■ Processes
■ Signals
■ Basic security, such as users and groups
■ System call infrastructure
■ TCP/IP stack and sockets
■ Firewall and packet filtering

To get an idea of just how complicated the interaction between these two sets of code can be, consider the idea of the fundamental executing unit. In BSD the fundamental unit is the process. In Mach it is a Mach thread. The disparity is settled by associating each BSD-style process with a Mach task, which starts out with exactly one Mach thread. When the BSD fork() system call is made, the BSD code in the kernel uses Mach calls to create a task and thread structure.

It is also important to note that the Mach and BSD layers have different security models. The Mach security model is based on port rights, and the BSD model is based on process ownership. Disparities between these two models have resulted in a number of local privilege-escalation vulnerabilities. Additionally, besides typical system calls, there are Mach traps that allow user-space programs to communicate with the kernel.

I/O Kit

I/O Kit is the open-source, object-oriented, device-driver framework in the XNU kernel and is responsible for the addition and management of dynamically loaded device drivers. These drivers allow modular code to be added to the kernel dynamically, for example to support different hardware.
The available drivers are usually stored in the /System/Library/Extensions/ directory or a subdirectory. The command kextstat will list all the currently loaded drivers:

    $ kextstat
    Index Refs Address    Size    Wired   Name (Version)
        1    1 0x0        0x0     0x0     com.apple.kernel (9.3.0)
        2   55 0x0        0x0     0x0     com.apple.kpi.bsd (9.3.0)
        3    3 0x0        0x0     0x0     com.apple.kpi.dsep (9.3.0)
        4   74 0x0        0x0     0x0     com.apple.kpi.iokit (9.3.0)
        5   79 0x0        0x0     0x0     com.apple.kpi.libkern (9.3.0)
        6   72 0x0        0x0     0x0     com.apple.kpi.mach (9.3.0)
        7   39 0x0        0x0     0x0     com.apple.kpi.unsupported (9.3.0)
        8    1 0x0        0x0     0x0     com.apple.iokit.IONVRAMFamily (9.3.0)
        9    1 0x0        0x0     0x0     com.apple.driver.AppleNMI (9.3.0)
       10    1 0x0        0x0     0x0     com.apple.iokit.IOSystemManagementFamily (9.3.0)
       11    1 0x0        0x0     0x0     com.apple.iokit.ApplePlatformFamily (9.3.0)
       12   31 0x0        0x0     0x0     com.apple.kernel.6.0 (7.9.9)
       13    1 0x0        0x0     0x0     com.apple.kernel.bsd (7.9.9)
       14    1 0x0        0x0     0x0     com.apple.kernel.iokit (7.9.9)
       15    1 0x0        0x0     0x0     com.apple.kernel.libkern (7.9.9)
       16    1 0x0        0x0     0x0     com.apple.kernel.mach (7.9.9)
       17   17 0x2e2bc000 0x10000 0xf000  com.apple.iokit.IOPCIFamily (2.4.1) <7 6 5 4>
       18   10 0x2e2d2000 0x4000  0x3000  com.apple.iokit.IOACPIFamily (1.2.0) <12>
       19    3 0x2e321000 0x3d000 0x3c000 com.apple.driver.AppleACPIPlatform (1.2.1) <18 17 12 7 5 4>
    …

Many of the entries in this list say they are loaded at address zero. This just means they are part of the kernel proper and aren’t really device drivers; that is, they cannot be unloaded. The first actual driver is number 17.

Besides kextstat, there are other commands you’ll need to know for loading and unloading these drivers. Suppose you wanted to find and load the driver associated with the MS-DOS file system. First you can use the kextfind tool to find the correct driver.

    $ kextfind -bundle-id -substring 'msdos'
    /System/Library/Extensions/msdosfs.kext

Now that you know the name of the kext bundle to load, you can load it into the running kernel.
    $ sudo kextload /System/Library/Extensions/msdosfs.kext
    kextload: /System/Library/Extensions/msdosfs.kext loaded successfully

It seemed to load properly. You can verify this and see where it was loaded.

    $ kextstat | grep msdos
      126    0 0x346d5000 0xc000 0xb000 com.apple.filesystems.msdosfs (1.5.2) <7 6 5 2>

The driver was assigned index 126. There are zero references to it (not surprising, since it wasn’t loaded before we loaded it). It has been loaded at address 0x346d5000 and has size 0xc000. This driver occupies 0xb000 wired bytes of kernel memory. Next come the driver’s name and version. The final field lists the indices of the other kernel extensions that this driver refers to; in this case, looking at the full listing of kextstat, we see it refers to the “unsupported,” mach, libkern, and bsd entries (indices 7, 6, 5, and 2). Finally, we can unload the driver.

    $ sudo kextunload com.apple.filesystems.msdosfs
    kextunload: unload kext /System/Library/Extensions/msdosfs.kext succeeded

Darwin and Friends

A kernel without applications isn’t very useful. That is where Darwin comes in. Darwin is the non-Aqua, open-source core of Mac OS X. Basically it is all the parts of Mac OS X for which the source code is available. The code is made available in the form of a package that is easy to install. There are hundreds of available Darwin packages, such as X11, GCC, and other GNU tools. Darwin provides many of the applications you may already use in BSD or Linux for Mac OS X. Apple has spent significant time integrating these packages into their operating system so that everything behaves nicely and has a consistent look and feel when possible.

On the other hand, many familiar pieces of Mac OS X are not open source. The main missing piece to someone running just the Darwin code will be Aqua, the Mac OS X windowing and graphical-interface environment.
Additionally, most of the common high-level applications, such as Safari, Mail, QuickTime, iChat, etc., are not open source (although some of their components are open source). Interestingly, these closed-source applications often rely on open-source software; for example, Safari relies on the WebKit project for HTML and JavaScript rendering. Perhaps for this reason, you also typically have many more symbols in these applications when debugging than you would in a Windows environment.

Tools of the Trade

Many of the standard Linux/BSD tools work on Mac OS X, but not all of them. If you haven’t already, it is important to install the Xcode package, which contains the system compiler (gcc) as well as many other tools, like the GNU debugger gdb. One of the most powerful tools that comes on Mac OS X is the object file displaying tool (otool). This tool fills the role of ldd, nm, objdump, and similar tools from Linux. For example, you can pass otool the -L option to get a list of the dynamically linked libraries needed by a binary.

    $ otool -L /bin/ls
    /bin/ls:
        /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
        /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.0.0)

To get a disassembly listing, you can use the -tv option.

    $ otool -tv /bin/ps
    /bin/ps:
    (__TEXT,__text) section
    00001bd0 pushl $0x00
    00001bd2 movl %esp,%ebp
    00001bd4 andl $0xf0,%esp
    00001bd7 subl $0x10,%esp
    …

You’ll see many references to other uses for otool throughout this book.

Ktrace/DTrace

You must be able to trace execution flow for processes. Before Leopard, this was the job of the ktrace command-line application. ktrace allows kernel trace logging for the specified process or command. For example, tracing the system calls of the ls command can be accomplished with

    $ ktrace -tc ls

This will create a file called ktrace.out.
To read this file, run the kdump command.

    $ kdump
    918 ktrace RET ktrace 0
    918 ktrace CALL execve(0xbffff73c,0xbffffd14,0xbffffd1c)
    918 ls RET execve 0
    918 ls CALL issetugid
    918 ls RET issetugid 0
    918 ls CALL __sysctl(0xbffff7cc,0x2,0xbffff7d4,0xbffff7c8,0x8fe45a90,0xa)
    918 ls RET __sysctl 0
    918 ls CALL __sysctl(0xbffff7d4,0x2,0x8fe599bc,0xbffff878,0,0)
    918 ls RET __sysctl 0
    918 ls CALL __sysctl(0xbffff7cc,0x2,0xbffff7d4,0xbffff7c8,0x8fe45abc,0xd)
    918 ls RET __sysctl 0
    918 ls CALL __sysctl(0xbffff7d4,0x2,0x8fe599b8,0xbffff878,0,0)
    918 ls RET __sysctl 0
    …

For more information, see the man page for ktrace.

In Leopard, ktrace is replaced by DTrace. DTrace is a kernel-level tracing mechanism. Throughout the kernel (and in some frameworks and applications) are special DTrace probes that can be activated. Instead of being an application with some command-line arguments, DTrace has an entire language, called D, to control its actions. DTrace is covered in detail in Chapter 4, “Tracing and Debugging,” but we present a quick example here as an appetizer.

    $ sudo dtrace -n 'syscall:::entry {@[execname] = count()}'
    dtrace: description 'syscall:::entry ' matched 427 probes
    ^C
      fseventsd            3
      socketfilterfw       3
      mysqld               6
      httpd                8
      pvsnatd              8
      configd             11
      DirectoryServic     14
      Terminal            17
      ntpd                21
      WindowServer        27
      mds                 33
      dtrace              38
      llipd               60
      SystemUIServer      69
      launchd            182
      nmblookup          288
      smbclient          386
      Finder            5232
      Mail              5352

Here, this one line of D within the DTrace command keeps track of the number of system calls made by each process until the user hits Ctrl+C. The entire functionality of ktrace can be replicated with DTrace in just a few lines of D. Being able to peer inside processes can be very useful when bug hunting or reverse-engineering, but there will be more on those topics later in the book.

Objective-C

Objective-C is the programming language and runtime for the Cocoa API used extensively by most applications within Mac OS X.
It is a superset of the C programming language, meaning that any C program will compile with an Objective-C compiler. The use of Objective-C has implications when applications are being reverse-engineered and exploited. More time will be spent on these topics in the corresponding chapters.

One of the most distinctive features of Objective-C is the way object-oriented programming is handled. Unlike in standard C++, in Objective-C, methods are not called directly. Rather, the object is sent a message. This architecture allows for dynamic binding; i.e., the selection of method implementation occurs at runtime, not at compile time. When a message is sent, a runtime function looks at the receiver and the method name in the message. It identifies the receiver’s implementation of the method by the name and executes that method.

The following small example shows the syntactic differences between C++ and Objective-C from a source-code perspective.

    /* Integer.h */
    #include <objc/Object.h>

    @interface Integer : Object
    {
        int integer;
    }
    - (int) integer;
    - (id) integer: (int) _integer;
    @end

Here an interface is defined for the class Integer. An interface serves the role of a declaration. A leading hyphen marks an instance method (a plus sign would mark a class method).

    /* Integer.m */
    #import "Integer.h"

    @implementation Integer
    - (int) integer
    {
        return integer;
    }
    - (id) integer: (int) _integer
    {
        integer = _integer;
        return self;    /* the method is declared to return id */
    }
    @end

Objective-C source files typically use the .m file extension. Within Integer.m are the implementations of the Integer methods. Also notice how arguments to functions are represented after a colon. One other small difference with C++ is that Objective-C provides the #import preprocessor directive, which acts like #include except that it includes the file only once.

    /* Display.h */
    #import "Integer.h"

    @interface Integer (Display)
    - (id) showint;
    @end

Another example follows.
    /* Display.m */
    #include <stdio.h>
    #import "Display.h"

    @implementation Integer (Display)
    - (id) showint
    {
        printf("%d\n", [self integer]);
        return self;
    }
    @end

In the second file, we see the first call of an object’s method. [self integer] is an example of the way methods are called in Objective-C. This is roughly equivalent to this->integer() in C++. Here are two more, slightly more complicated files:

    /* Add_Mult.h */
    #import "Integer.h"

    @interface Integer (Add_Mult)
    - (id) add_mult: (Integer *) addend with_multiplier: (int) mult;
    @end

and

    /* Add_Mult.m */
    #import "Add_Mult.h"

    @implementation Integer (Add_Mult)
    - (id) add_mult: (Integer *) addend with_multiplier: (int) mult
    {
        /* uses the integer / integer: accessors declared in Integer.h */
        return [self integer: [self integer] + [addend integer] * mult];
    }
    @end

These two files show how multiple parameters are passed to a function. A label, in this case with_multiplier, can be added to the additional parameters. The method is referred to as add_mult:with_multiplier:. The following code shows how to call a function requiring multiple parameters.

    /* main.m */
    #include <stdlib.h>
    #import "Integer.h"
    #import "Add_Mult.h"
    #import "Display.h"

    int main(int argc, char *argv[])
    {
        Integer *num1 = [Integer new], *num2 = [Integer new];
        [num1 integer:atoi(argv[1])];
        [num2 integer:atoi(argv[2])];
        [num1 add_mult:num2 with_multiplier: 2];
        [num1 showint];
        return 0;
    }

Building this is as easy as invoking gcc with an additional argument.

    $ gcc -g -x objective-c main.m Integer.m Add_Mult.m Display.m -lobjc

Running the program shows that it can indeed add a number multiplied by two.

    $ ./a.out 1 4
    9

As a sample of things to come, consider the disassembled version of the add_mult:with_multiplier: function.
    0x1f02 push ebp
    0x1f03 mov ebp,esp
    0x1f05 push edi
    0x1f06 push esi
    0x1f07 push ebx
    0x1f08 sub esp,0x1c
    0x1f0b call 0x1f10
    0x1f10 pop ebx
    0x1f11 mov edi,DWORD PTR [ebp+0x8]
    0x1f14 mov edx,DWORD PTR [ebp+0x8]
    0x1f17 lea eax,[ebx+0x1100]
    0x1f1d mov eax,DWORD PTR [eax]
    0x1f1f mov DWORD PTR [esp+0x4],eax
    0x1f23 mov DWORD PTR [esp],edx
    0x1f26 call 0x400a
    0x1f2b mov esi,eax
    0x1f2d mov edx,DWORD PTR [ebp+0x10]
    0x1f30 lea eax,[ebx+0x1100]
    0x1f36 mov eax,DWORD PTR [eax]
    0x1f38 mov DWORD PTR [esp+0x4],eax
    0x1f3c mov DWORD PTR [esp],edx
    0x1f3f call 0x400a
    0x1f44 imul eax,DWORD PTR [ebp+0x14]
    0x1f48 lea edx,[esi+eax]
    0x1f4b lea eax,[ebx+0x10f8]
    0x1f51 mov eax,DWORD PTR [eax]
    0x1f53 mov DWORD PTR [esp+0x8],edx
    0x1f57 mov DWORD PTR [esp+0x4],eax
    0x1f5b mov DWORD PTR [esp],edi
    0x1f5e call 0x400a
    0x1f63 add esp,0x1c
    0x1f66 pop ebx
    0x1f67 pop esi
    0x1f68 pop edi
    0x1f69 leave
    0x1f6a ret

Looking at this, it is tough to imagine what this function does. While there is an instruction for the multiplication (imul), no add instruction operates on the two integers; the addition is folded into the lea instruction at 0x1f48. You’ll also see that, typical of an Objective-C binary, almost every function call is to objc_msgSend, which can make it difficult to know what is going on. There is also the strange call instruction at address 0x1f0b, which calls the next instruction. These problems (along with some solutions) will be addressed in more detail in Chapter 6, “Reverse Engineering.”

Universal Binaries and the Mach-O File Format

Applications and libraries in Mac OS X use the Mach-O (Mach object) file format and may be built for several architectures at once, in which case they are called universal binaries.

Universal Binaries

For legacy support, many binaries in Leopard are universal binaries. A universal binary can support multiple architectures in the same file. For Mac OS X, this is usually PowerPC and x86.
    $ file /bin/ls
    /bin/ls: Mach-O universal binary with 2 architectures
    /bin/ls (for architecture i386): Mach-O executable i386
    /bin/ls (for architecture ppc7400): Mach-O executable ppc

Each universal binary has the code necessary to run on any of the architectures it supports. The same exact ls binary from the code example can run on a Mac with an x86 processor or a PowerPC processor. The obvious drawback is file size, of course.

The gcc compiler in Mac OS X emits Mach-O-format binaries by default. To build a universal binary, pass one -arch flag for each target architecture. In the following example, a universal binary for the x86 and PowerPC architectures is created.

    $ gcc -arch ppc -arch i386 -o test-universal test.c
    $ file test-universal
    test-universal: Mach-O universal binary with 2 architectures
    test-universal (for architecture ppc7400): Mach-O executable ppc
    test-universal (for architecture i386): Mach-O executable i386

To see the file-size difference, compare this binary to the single-architecture version:

    -rwxr-xr-x 1 user1 user1 12564 May 1 12:55 test
    -rwxr-xr-x 1 user1 user1 28948 May 1 12:54 test-universal

Mach-O File Format

This file format supports both statically and dynamically linked executables. The basic structure contains three regions: the header, the load commands, and the actual data.

The header contains basic information about the file, such as magic bytes to identify it as a Mach-O file and information about the target architecture. The following is the structure of the header, courtesy of the /usr/include/mach-o/loader.h file.

    struct mach_header {
        uint32_t      magic;
        cpu_type_t    cputype;
        cpu_subtype_t cpusubtype;
        uint32_t      filetype;
        uint32_t      ncmds;
        uint32_t      sizeofcmds;
        uint32_t      flags;
    };

The magic number identifies the file as Mach-O. The cputype will probably be either PowerPC or i386. The cpusubtype can specify specific models of CPU on which to run.
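As a rough illustration of how the magic bytes identify a file, the following C sketch classifies a file from its first four bytes. The magic values are the ones defined in mach-o/loader.h and mach-o/fat.h, copied in so the code builds on any platform; the file_kind enum and the classify_magic() and read_be32() helpers are hypothetical names invented for this sketch, and only the 32-bit formats discussed here are handled.

```c
#include <stdint.h>

/* Magic values as defined in <mach-o/loader.h> and <mach-o/fat.h>,
   repeated here so the sketch compiles on any platform. */
#define MH_MAGIC  0xfeedfaceu  /* 32-bit Mach-O header */
#define MH_CIGAM  0xcefaedfeu  /* the same value with its bytes swapped */
#define FAT_MAGIC 0xcafebabeu  /* universal (fat) header */

/* Hypothetical result type for this sketch. */
enum file_kind {
    KIND_UNKNOWN,
    KIND_UNIVERSAL,  /* fat header; it is always stored big-endian */
    KIND_MACHO_BE,   /* thin Mach-O written big-endian (e.g. PowerPC) */
    KIND_MACHO_LE    /* thin Mach-O written little-endian (e.g. x86) */
};

/* Read four bytes as a big-endian 32-bit value, whatever the host is. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Classify a file from the first four bytes of its contents. */
enum file_kind classify_magic(const unsigned char first4[4])
{
    switch (read_be32(first4)) {
    case FAT_MAGIC: return KIND_UNIVERSAL;
    case MH_MAGIC:  return KIND_MACHO_BE;
    case MH_CIGAM:  return KIND_MACHO_LE;
    default:        return KIND_UNKNOWN;
    }
}
```

Reading the bytes in a fixed order keeps the check independent of the machine doing the inspection: an x86 Mach-O file begins with the bytes ce fa ed fe on disk, so it shows up as MH_CIGAM when read big-endian.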
The filetype indicates the usage and alignment for the file.

    fat_magic 0xcafebabe
    nfat_arch 2
    architecture 0
        cputype 7
        cpusubtype 3
        capabilities 0x0
        offset 4096
        size 36464
        align 2^12 (4096)
    architecture 1
        cputype 18
        cpusubtype 10
        capabilities 0x0
        offset 40960
        size 32736
        align 2^12 (4096)

Looking at /usr/include/mach/machine.h, you can see that the first architecture has cputype 7, which corresponds to CPU_TYPE_X86 and has a cpusubtype of CPU_SUBTYPE_386. Not surprisingly, the second architecture has values CPU_TYPE_POWERPC and CPU_SUBTYPE_POWERPC_7400, respectively.

Next we can obtain the Mach header.

    $ otool -h /bin/ls
    /bin/ls:
    Mach header
          magic cputype cpusubtype  caps filetype ncmds sizeofcmds      flags
     0xfeedface       7          3  0x00        2    14       1304 0x00000085

In this case, we again see the cputype and cpusubtype. The filetype is MH_EXECUTE and there are 14 load commands. The flags work out to be MH_NOUNDEFS | MH_DYLDLINK | MH_TWOLEVEL.

Moving on, we see some of the load commands for this binary.

    $ otool -l /bin/ls
    /bin/ls:
    Load command 0
          cmd LC_SEGMENT
      cmdsize 56
      segname __PAGEZERO
       vmaddr 0x00000000
       vmsize 0x00001000
      fileoff 0
     filesize 0
      maxprot 0x00000000
     initprot 0x00000000
       nsects 0
        flags 0x0
    Load command 1
          cmd LC_SEGMENT
      cmdsize 260
      segname __TEXT
       vmaddr 0x00001000
       vmsize 0x00005000
      fileoff 0
     filesize 20480
      maxprot 0x00000007
     initprot 0x00000005
       nsects 3
        flags 0x0
    Section
      sectname __text
       segname __TEXT
          addr 0x000023c4
          size 0x000035df
        offset 5060
         align 2^2 (4)
        reloff 0
        nreloc 0
         flags 0x80000400
     reserved1 0
     reserved2 0
    …

Bundles

In Mac OS X, shared resources are contained in bundles. Many kinds of bundles contain related files, but we’ll focus mostly on application and framework bundles. The types of resources contained within a bundle may consist of applications, libraries, images, documentation, header files, etc. Basically, a bundle is a directory structure within the file system.
Interestingly, by default this directory looks like a single object in Finder.

    $ ls -ld iTunes.app
    drwxrwxr-x 3 root admin 102 Apr 4 13:15 iTunes.app

This naive view of files can be changed within Finder by selecting Show Package Contents in the Action menu, but you probably use the Terminal application rather than Finder, anyway.

Within application bundles, there is usually a single folder called Contents. We’ll give you a quick tour of the QuickTime Player bundle.

    $ ls /Applications/QuickTime\ Player.app/Contents/
    CodeResources  Info.plist  PkgInfo  Resources
    Frameworks     MacOS       PlugIns  version.plist

The binary itself is within the MacOS directory. If you want to launch the program through the command line or a script, you will likely have to refer to the following binary, for example.

    $ /Applications/QuickTime\ Player.app/Contents/MacOS/QuickTime\ Player

The Resources directory contains much of the noncode content, such as images, movies, and icons. The Frameworks directory contains the associated framework bundles, in this case DotMacKit. Finally, there are a number of plist, or property list, files.

Property-list files contain configuration information. A plist file may contain user-specific or system-wide information. Plist files can be in either binary or XML format. The XML versions are relatively straightforward to read. The following is the beginning of the Info.plist file from QuickTime Player.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>CFBundleDevelopmentRegion</key>
        <string>English</string>
        <key>CFBundleDocumentTypes</key>
        <array>
            <dict>
                <key>CFBundleTypeExtensions</key>
                <array>
                    <string>aac</string>
                    <string>adts</string>
                </array>
                <key>CFBundleTypeMIMETypes</key>
                <array>
                    <string>audio/aac</string>
                    <string>audio/x-aac</string>
                </array>
                <key>CFBundleTypeName</key>
                <string>Audio-AAC</string>
                <key>CFBundleTypeRole</key>
                <string>Viewer</string>
                <key>NSDocumentClass</key>
                <string>QTPMovieDocument</string>
                <key>NSPersistentStoreTypeKey</key>
                <string>Binary</string>
            </dict>
    …

Many of the keys and their meaning can be found at http://developer.apple.com/documentation/MacOSX/Conceptual/BPRuntimeConfig/Articles/PListKeys.html.
Here is a quick description of those found in the excerpt:

■ CFBundleDevelopmentRegion: The native region for the bundle
■ CFBundleDocumentTypes: The document types supported by the bundle
■ CFBundleTypeExtensions: File extension to associate with this document type
■ CFBundleTypeMIMETypes: MIME type name to associate with this document type
■ CFBundleTypeName: An abstract (and unique) way to refer to the document type
■ CFBundleTypeRole: The application’s role with respect to this document type; possibilities are Editor, Viewer, Shell, or None
■ NSDocumentClass: Legacy key for Cocoa applications
■ NSPersistentStoreTypeKey: The Core Data type

Many of these will be important later, when we’re identifying the attack surface in Chapter 3, “Attack Surface.”

It is possible to convert this XML plist into a binary plist using plutil, or vice versa.

    $ plutil -convert binary1 -o Binary.Info.plist Info.plist
    $ plutil -convert xml1 -o XML.Binary.Info.plist Binary.Info.plist
    $ file *Info.plist
    Binary.Info.plist: Apple binary property list
    Info.plist: XML 1.0 document text
    XML.Binary.Info.plist: XML 1.0 document text
    $ md5sum XML.Binary.Info.plist Info.plist
    de13b98c54a93c052050294d9ca9d119 XML.Binary.Info.plist
    de13b98c54a93c052050294d9ca9d119 Info.plist

Here we first converted QuickTime Player’s Info.plist to binary format. We then converted it back into XML format. The file command shows the conversion has occurred and md5sum confirms that the conversion is precisely reversible.

launchd

Launchd is Apple’s replacement for cron, xinetd, init, and others. It was introduced in Mac OS X v10.4 (Tiger) and performs tasks such as initializing systems, running startup programs, etc. It allows processes to be started at various times or when various conditions occur, and ensures that particular processes are always running. It handles daemons at both the system and user level.
The systemwide launchd configuration files are stored in the /System/Library/LaunchAgents and /System/Library/LaunchDaemons directories. User-specific files are in ~/Library/LaunchAgents. The difference between daemons and agents is that daemons run as root and are intended to run in the background. Agents are run with the privileges of a user and may run in the foreground; they can even include a graphical user interface. Launchctl is a command-line application used to load and unload the daemons.

The configuration files for launchd are, not surprisingly, plists. We’ll show you how one works. Consider the file com.apple.PreferenceSyncAgent.plist.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>Label</key>
        <string>com.apple.PreferenceSyncAgent</string>
        <key>ProgramArguments</key>
        <array>
            <string>/System/Library/CoreServices/PreferenceSyncClient.app/Contents/MacOS/PreferenceSyncClient</string>
            <string>--sync</string>
            <string>--periodic</string>
        </array>
        <key>StartInterval</key>
        <integer>3599</integer>
    </dict>
    </plist>

This plist uses three keys. The Label key identifies the job to launchd. ProgramArguments is an array consisting of the application to run as well as any necessary command-line arguments. Finally, StartInterval indicates that this process should be run every 3,599 seconds, or just more than once an hour. Other keys that might be of interest include

■ UserName: Indicates the user to run the job as
■ OnDemand: Indicates whether to run the job when asked or keep it running all the time
■ StartCalendarInterval: Provides cron-like launching of applications at various times

Why should you care about this? Well, there are a few times it might be handy. One is when breaking out of a sandbox, which we’ll discuss later in this chapter. Another is when providing the automated processing needed in fuzzing, which we’ll discuss more in Chapter 4’s section “In-Memory Fuzzing.” For example, consider the following plist file.
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>Label</key>
        <string>com.apple.KeepSafariAlive</string>
        <key>ProgramArguments</key>
        <array>
            <string>/Applications/Safari.app/Contents/MacOS/Safari</string>
        </array>
        <key>OnDemand</key>
        <false/>
    </dict>
    </plist>

Save this to a file called ~/Library/LaunchAgents/com.apple.KeepSafariAlive.plist. Then start it up with

    $ launchctl load ~/Library/LaunchAgents/com.apple.KeepSafariAlive.plist

This should start up Safari. Imagine a situation in which fuzzing is occurring while you’re using a Meta refresh tag from Safari’s default home page. The problem is that when Safari inevitably crashes, the fuzzing will stop. The solution is the preceding launchd file, which restarts it automatically. Give it a try, and pretend the fuzzing killed Safari.

    $ killall -9 Safari

The launchd agent should respawn Safari automatically. To turn off this launchd job, issue the following command:

    $ launchctl unload ~/Library/LaunchAgents/com.apple.KeepSafariAlive.plist

Leopard Security

Since we’re talking about Mac OS X in general, we should talk about security features added to Leopard. This section covers some topics of interest from this field. Some of these address new features of Leopard while others are merely updates to topics relevant to the security of the system.

Library Randomization

There are two steps to attacking an application. The first is to find a vulnerability. The second is to exploit it in a reliable manner. There seems to be no end to vulnerabilities in code. It is very difficult to eliminate all the bugs from an old code base, considering that a vulnerability may present itself as a missing character in one line out of millions of lines of source code. Therefore, many vendors have concluded that vulnerabilities are inevitable, but they can at least make exploitation difficult if not impossible to accomplish.

Beginning with Leopard, one anti-exploitation method Mac OS X employs is library randomization. Leopard randomizes the addresses of most libraries within a process address space.
This makes it harder for an attacker to get control, as they cannot rely on these addresses being the same. Nevertheless, Leopard still does not randomize many elements of the address space. Therefore we prefer not to use the term address space layout randomization (ASLR) when referring to Leopard. In true ASLR, the locations of the executable, libraries, heap, and stack are all randomized. As you’ll see shortly, in Leopard only the location of (most of) the libraries is randomized. Unfortunately for Apple, just as one bug is enough to open a system to attacks, leaving anything not randomized is often enough to allow a successful attack, and this will be demonstrated in Chapters 7, 8, and 10. By way of comparison, Windows is often criticized for not forcing third-party applications (such as Java) to build their libraries to be compatible with ASLR. In Leopard, library randomization is not possible even in the Apple binaries!

Leopard’s library randomization is not well documented, but critical information on the topic can be found in the /var/db/dyld directory. For example, the map of where different libraries should be loaded is in the dyld_shared_cache_i386.map file in this directory. An example of this file’s contents is provided in the code that follows. Obviously, the contents of this file will be different on different systems; however, the contents do not change upon reboot. This file may change when the system is updated. The file is updated when the update_dyld_shared_cache program is run. Since the location in which the libraries are loaded is fixed for extended periods of time for a given system across all processes, the library randomization implemented by Leopard does not help prevent local-privilege escalation attacks.
    /usr/lib/system/libmathCommon.A.dylib
        __TEXT     0x945B3000 -> 0x945B8000
        __DATA     0xA0679000 -> 0xA067A000
        __LINKEDIT 0x9735F000 -> 0x9773D000
    /System/Library/Frameworks/Quartz.framework/Versions/A/Frameworks/ImageKit.framework/Versions/A/ImageKit
        __TEXT     0x945B8000 -> 0x946F0000
        __DATA     0xA067A000 -> 0xA0682000
        __OBJC     0xA0682000 -> 0xA06A6000
        __IMPORT   0xA0A59000 -> 0xA0A5A000
        __LINKEDIT 0x9735F000 -> 0x9773D000

This excerpt from the dyld_shared_cache_i386.map file shows where two libraries, libmathCommon and ImageKit, will be loaded in memory on this system.

To get a better idea of how Leopard’s randomization works (or doesn’t), consider the following simple C program.

    #include <stdio.h>
    #include <stdlib.h>

    void foo(){
        ;
    }

    int main(int argc, char *argv[]){
        int y;
        char *x = (char *) malloc(128);
        /* 32-bit build: each pointer fits in an unsigned int */
        printf("Lib function: %08x, Heap: %08x, Stack: %08x, Binary: %08x\n",
               (unsigned int) &malloc, (unsigned int) x,
               (unsigned int) &y, (unsigned int) &foo);
        return 0;
    }

This program prints out the address of the malloc() routine located within libSystem. It then prints out the address of a malloced heap buffer, of a stack variable, and, finally, of a function from the application image. Running this program on one computer (even after reboots) always reveals the same numbers; however, running this program on different machines shows some differences in the output. The following is the output from this program run on five different Leopard computers.

    Lib function: 920d7795, Heap: 00100120, Stack: bffff768, Binary: 00001f66
    Lib function: 9120b795, Heap: 00100120, Stack: bffffab8, Binary: 00001f66
    Lib function: 93809795, Heap: 00100120, Stack: bffff9a8, Binary: 00001f66
    Lib function: 93d9e795, Heap: 00100120, Stack: bffff8d8, Binary: 00001f66
    Lib function: 96841795, Heap: 00100120, Stack: bffffa38, Binary: 00001f66

This demonstrates that the addresses to which libraries are loaded are indeed randomized from machine to machine. However, the heap and the application image clearly are not, in this case at least.
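The five outputs above can be examined a little more closely. The following C sketch is only an illustration: the addresses are copied from the listing, the helper names same_page_offset() and address_span() are made up for this example, and a 4 KB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Addresses copied from the five runs shown in the text. */
const uint32_t lib_addrs[5] = {
    0x920d7795u, 0x9120b795u, 0x93809795u, 0x93d9e795u, 0x96841795u
};
const uint32_t stack_addrs[5] = {
    0xbffff768u, 0xbffffab8u, 0xbffff9a8u, 0xbffff8d8u, 0xbffffa38u
};

/* Low 12 bits of an address: its offset inside a 4 KB page (assumed size). */
#define PAGE_OFFSET_MASK 0xfffu

/* Returns 1 if every address has the same offset within its page. */
int same_page_offset(const uint32_t *addrs, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++) {
        if ((addrs[i] & PAGE_OFFSET_MASK) != (addrs[0] & PAGE_OFFSET_MASK))
            return 0;
    }
    return 1;
}

/* Returns the distance between the lowest and highest address in the set. */
uint32_t address_span(const uint32_t *addrs, size_t n)
{
    uint32_t lo = addrs[0], hi = addrs[0];
    size_t i;
    for (i = 1; i < n; i++) {
        if (addrs[i] < lo) lo = addrs[i];
        if (addrs[i] > hi) hi = addrs[i];
    }
    return hi - lo;
}
```

For the library addresses, same_page_offset() returns 1 (every value ends in 0x795) and address_span() returns 0x05636000, roughly 86 MB, which is consistent with the library being moved in whole pages between machines. For the stack addresses the span is only 0x350 bytes.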
The small amount of variation in the location of the stack buffer can be attributed to the stack containing the environment for the program, which will differ depending on the user’s configuration. The stack location is not randomized. So while some basic randomization occurs, there are still significant portions of the memory that are not random, and, in fact, are completely predictable. We’ll show in Chapters 7 and 8 how to defeat this limited randomization.

Executable Heap

Another approach to making exploitation more difficult is to make it hard to execute injected code within a process; i.e., hard to execute shellcode. To do this, it is important to make as much of the process space nonexecutable as possible. Obviously, some of the space must be executable to run programs, but making the stack and heap nonexecutable can go a long way toward making exploitation difficult. This is the idea behind Data Execution Prevention (DEP) in Windows and W^X in OpenBSD.

Before we dive into an explanation of memory protection in Leopard, we need first to discuss hardware protections. For x86 processors, Apple uses chips from Intel. Intel uses the XD bit, or Execute Disable bit, stored in the page tables to mark areas of memory as nonexecutable. (In AMD processors, this is called the NX bit for No Execute.) Any section of memory with the XD bit set can be used only for reading or writing data; any attempt to execute code from this memory will cause a program crash. In Mac OS X, the XD bit is set on all stack memory, thus preventing execution from the stack. Consider the following program that attempts to execute where the XD bit is set.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char shellcode[] = "\xeb\xfe";

    int main(int argc, char *argv[]){
        void (*f)();
        char x[4];
        memcpy(x, shellcode, sizeof(shellcode));
        f = (void (*)()) x;
        f();
    }

Running this program shows that it crashes when it attempts to execute on the stack.

    $ ./stack_executable
    Segmentation fault

This same program will execute on a Mac running on a PPC chip (although the shellcode will be wrong, of course), since the stack is executable in that architecture.

The stack is in good shape, but what about the heap? A quick look with the vmmap utility shows that the heap is read/write only.

    ==== Writable regions for process 12137
    __DATA          00002000-00003000 [    4K] rw-/rwx SM=COW foo
    __IMPORT        00003000-00004000 [    4K] rwx/rwx SM=COW foo
    MALLOC (freed?) 00006000-00007000 [    4K] rw-/rwx SM=PRV
    MALLOC_TINY     00100000-00200000 [ 1024K] rw-/rwx SM=PRV DefaultMallocZone_0x100000
    __DATA          8fe2e000-8fe30000 [    8K] rw-/rwx SM=COW /usr/lib/dyld
    __DATA          8fe30000-8fe67000 [  220K] rw-/rwx SM=PRV /usr/lib/dyld
    __DATA          a052e000-a052f000 [    4K] rw-/rw- SM=COW /usr/lib/system/libmathCommon.A.dylib
    __DATA          a0550000-a0551000 [    4K] rw-/rw- SM=COW /usr/lib/libgcc_s.1.dylib
    shared pmap     a0600000-a07e5000 [ 1940K] rw-/rwx SM=COW
    __DATA          a07e5000-a083f000 [  360K] rw-/rwx SM=COW /usr/lib/libSystem.B.dylib
    shared pmap     a083f000-a09ac000 [ 1460K] rw-/rwx SM=COW
    Stack           bf800000-bffff000 [ 8188K] rw-/rwx SM=ZER
    Stack           bffff000-c0000000 [    4K] rw-/rwx SM=COW thread 0

Leopard does not set the XD bit on any parts of memory besides the stack. It is unclear if this is a bug, an oversight, or intentional, but even if the software’s memory permissions are set to be nonexecutable, you can still execute anywhere except the stack. The following simple program illustrates that point.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char shellcode[] = "\xeb\xfe";

    int main(int argc, char *argv[]){
        void (*f)();
        /* sizeof(shellcode) is 3: two opcode bytes plus the trailing NUL */
        char *x = malloc(sizeof(shellcode));
        memcpy(x, shellcode, sizeof(shellcode));
        f = (void (*)()) x;
        f();
    }

This program copies some shellcode (in this case a simple infinite loop) onto the heap and then executes it. It runs fine, and with a debugger you can verify that it is indeed executing within the heap buffer. Taking this one step further, we can explicitly set the heap buffer to be nonexecutable and still execute there.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    char shellcode[] = "\xeb\xfe";

    int main(int argc, char *argv[]){
        void (*f)();
        char *x = malloc(sizeof(shellcode));
        /* round the buffer address down to the start of its 4 KB page */
        unsigned int page_start = ((unsigned int) x) & 0xfffff000;
        int ret = mprotect((void *) page_start, 4096, PROT_READ | PROT_WRITE);
        if(ret<0){
            perror("mprotect failed");
        }
        memcpy(x, shellcode, sizeof(shellcode));
        f = (void (*)()) x;
        f();
    }

Amazingly, this code still executes fine. Furthermore, even the stack protections can be overridden with a call to mprotect.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    char shellcode[] = "\xeb\xfe";

    int main(int argc, char *argv[]){
        void (*f)();
        char x[4];
        memcpy(x, shellcode, sizeof(shellcode));
        f = (void (*)()) x;
        mprotect((void *) 0xbffff000, 4092, PROT_READ | PROT_WRITE | PROT_EXEC);
        f();
    }

This might be a possible avenue of attack in a return-to-libc attack. So, to summarize, within Leopard it is possible to execute code anywhere in a process besides the stack. Furthermore, it is possible to execute code on the stack after a call to mprotect.

Stack Protection (propolice)

Although you would think stack overflows are a relic of the past, they do still arise, as you’ll see in Chapter 7, “Exploring Stack Overflows.” An operating system’s designers need to worry about making stack overflows difficult to exploit; otherwise, the exploitation of overflows is entirely trivial and reliable.
With this in mind, the GCC compiler that comes with Leopard has an option called -fstack-protector that sets a value on the stack, called a canary. This value is randomly set and placed between the stack variables and the stack metadata. Then, before a function returns, the canary value is checked to ensure it hasn’t changed. In this way, if a stack buffer overflow were to occur, the important metadata stored on the stack, such as the return address and saved frame pointer, could not be corrupted without first corrupting the canary. This helps protect against simple stack-based overflows. Consider the following program.

    #include <string.h>

    int main(int argc, char *argv[]){
        char buf[16];
        strcpy(buf, argv[1]);
    }

This contains an obvious stack-overflow vulnerability. Normal execution causes an exploitable crash.

    $ gdb ./stack_police
    GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Tue Oct 2 04:07:49 UTC 2007)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-apple-darwin"…
    No symbol table is loaded. Use the "file" command.
    Reading symbols for shared libraries … done
    (gdb) set args AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    (gdb) r
    Starting program: /Users/cmiller/book/macosx-book/stack_police AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    Reading symbols for shared libraries ++. done
    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_INVALID_ADDRESS at address: 0x41414141
    0x41414141 in ?? ()
    (gdb)

Compiling with the propolice option, however, prevents exploitation.
    $ gcc -g -fstack-protector -o stack_police stack_police.c
    $ ./stack_police AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    Abort trap

In this case, a SIGABRT signal was sent by the function that checks the canary’s value. This is a good protection against stack-overflow exploitation, but it helps only if it is used. Leopard binaries sometimes use it and sometimes don’t. Observe.

    $ nm QuickTime\ Player | grep stack
             U ___stack_chk_fail
             U ___stack_chk_guard
    $ nm /Applications/Safari.app/Contents/MacOS/Safari | grep stack

Here, the nm tool (along with grep) is used to find the symbols utilized in two applications: QuickTime Player and Safari. QuickTime Player contains the symbols that are used to validate the stack, whereas Safari does not. Therefore, the code within the main Safari executable does not have this protection enabled.

It is important to note that this stack protection applies only to code in source files that were themselves compiled with the option. In other words, within a single application or library, there may be some functions with this protection enabled but others without the protection enabled.

One final note: It is possible to confuse propolice by smashing the stack completely. Consider the previous sample program with 5,000 characters entered as the first argument.

    (gdb) set args `perl -e 'print "A"x5000'`
    (gdb) r
    Starting program: /Users/cmiller/book/macosx-book/stack_police `perl -e 'print "A"x5000'`
    Reading symbols for shared libraries ++. done
    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_INVALID_ADDRESS at address: 0x41414140
    0x920df690 in strlen ()
    (gdb) bt
    #0 0x920df690 in strlen ()
    #1 0x92101927 in strdup ()
    #2 0x92103947 in asl_set_query ()
    #3 0x9211703e in asl_set ()
    #4 0x92130511 in vsyslog ()
    #5 0x921303e8 in syslog ()
    #6 0x921b3ef1 in __stack_chk_fail ()
    #7 0x00001ff7 in main (argc=1094795585, argv=0xbfffcfcc) at stack_police.c:4

The stack-check failure handler, __stack_chk_fail(), calls syslog("error %s", argv[0]);. We have overwritten the argv[0] pointer with our own value. This does not appear to be exploitable, but unexpected behavior in the stack-check failure handler is not a good sign.

Firewall

Theoretically, Leopard offers important security improvements in the form of its firewall. In Tiger the firewall was based on ipfw (IP firewall), the BSD firewall. The open ports were controlled by the application’s plist files. In Leopard, ipfw is still there but always has a single rule.

    $ sudo ipfw list
    65535 allow ip from any to any

Instead the firewall is truly application based and is controlled by /usr/libexec/ApplicationFirewall/socketfilterfw and the associated com.apple.nke.applicationfirewall driver.

Many issues with Leopard’s firewall prevent it from being a significant obstacle to attack. The first is that it is not enabled by default. Obviously, if it is not on, it isn’t an issue for an attacker. The next is that it blocks only incoming connections. This means any Leopard box that had some services running and listening might be protected; however, out-of-the-box Macs don’t have many listening processes running, so this isn’t really an issue. If users were to turn on something extra, like file sharing, they would obviously allow connections through the firewall, too.
As far as exploit payload goes, it is no more difficult to write a payload that connects out from the compromised host (allowed by the firewall) than to sit and wait for incoming connections (not allowed by the firewall). Regardless, it is hard to imagine a scenario in which the Leopard firewall would actually prevent an otherwise-successful attack from working. Instead, it is basically designed to prevent errant third-party applications from opening listening ports.

Sandboxing (Seatbelt)

Another security feature introduced in Leopard is the idea of sandboxing applications with the kernel extension Seatbelt. This mechanism is based on the principle that your Web browser probably doesn’t need to access your address book and your media player probably doesn’t need to bind to a port. Seatbelt allows an application developer to explicitly allow or deny an application to perform particular actions. In this way, exploitation of a vulnerability in a particular application doesn’t necessarily provide complete access to the system.

Currently the source code for this mechanism is not available, but by looking at and playing around with the XNU source code, it becomes clear how application sandboxing works. The documentation for it is scarce to nonexistent. At this point, this feature is not intended to be used by anyone but Apple engineers, as the following warning indicates.

    WARNING: The sandbox rule capabilities and syntax used in this file are
    currently an Apple SPI (System Private Interface) and are subject to change
    at any time without notice. Apple may in [the] future announce an official
    public supported sandbox API, but until then Developers are cautioned not
    to build products that use or depend on the sandbox facilities illustrated
    here.

With one exception, applications that are to be sandboxed need to explicitly call the function sandbox_init() to execute within a sandbox.
All child processes of a sandboxed process also operate within the sandbox. This allows you to sandbox applications that do not explicitly call sandbox_init() by executing them from within an application in an existing sandbox.

One of the parameters to the sandbox_init() function is the name of a profile in which to execute. Available profiles include the following.

■ kSBXProfileNoInternet: TCP/IP networking is prohibited.
■ kSBXProfileNoNetwork: All sockets-based networking is prohibited.
■ kSBXProfileNoWrite: File-system writes are prohibited.
■ kSBXProfileNoWriteExceptTemporary: File-system writes are restricted to the temporary folder /var/tmp and the folder specified by the confstr(3) configuration variable _CS_DARWIN_USER_TEMP_DIR.
■ kSBXProfilePureComputation: All operating-system services are prohibited.

These profiles are statically compiled into the kernel. We will test some of these profiles in the following code by using the sandbox-exec command. For this command, these profiles are summoned by the terms nointernet, nonet, nowrite, write-tmp-only, and pure-computation.

    $ sandbox-exec -n nonet /bin/bash
    bash-3.2$ ping www.google.com
    bash: /sbin/ping: Operation not permitted
    bash-3.2$ exit
    $ sandbox-exec -n nowrite /bin/bash
    bash-3.2$ cat > foo
    bash: foo: Operation not permitted

Here we demonstrate starting the bash shell with no networking allowed. We omit showing that all the local commands still work and jump straight to trying to use ping, which fails. Exiting out of that sandbox, we try out the nowrite sandbox and demonstrate that we cannot write files even though normally it would be allowed.

Additionally, it is possible to use a custom-written profile. Although there is no documentation on how to write one of these profiles, there are quite a few well-documented examples in the /usr/share/sandbox directory from which to start.
These files are written using syntax from the Scheme programming language and describe all the applications currently sandboxed. These applications include

■ krb5kdc
■ mDNSResponder
■ mdworker
■ named
■ ntpd
■ portmap
■ quicklookd
■ syslogd
■ update
■ xgridagentd
■ xgridagentd_task_nobody
■ xgridagentd_task_somebody
■ xgridcontrollerd

Take a look at a couple of these files. The first is quicklookd.

    ;;
    ;; quicklookd - sandbox profile
    ;; Copyright (c) 2006-2007 Apple Inc. All Rights reserved.
    ;;
    ;; WARNING: The sandbox rules in this file currently constitute
    ;; Apple System Private Interface and are subject to change at any time and
    ;; without notice. The contents of this file are also auto-generated and not
    ;; user editable; it may be overwritten at any time.
    ;;
    (version 1)

    (allow default)
    (deny network-outbound)
    (allow network-outbound (to unix-socket))
    (deny network*)

    (debug deny)

This policy says that, by default, all actions are allowed except those that are specifically denied. In this case, network communication is denied, as the application doesn’t need it. Therefore, if this process were taken over by a remote attacker (say, by providing the victim with a malicious file), the process would not be able to open a remote socket back to the attacker. We’ll discuss a way around this in a moment. Another example is update.sb.

    (version 1)
    (debug deny)

    (allow process-exec (regex #"^/usr/sbin/update$"))
    (allow sysctl-read)
    (allow file-read-data file-read-metadata
        (regex #"^/usr/lib/.*\.dylib$"
               #"^/var"
               #"^/private/var/db/dyld/"
               #"^/dev/urandom$"
               #"^/dev/dtracehelper$"))

    (deny default)

This policy denies all actions by default and allows only those explicitly needed. This is generally a safer approach. In this case, update can read files only from select directories.

Now take a moment to see how this works on a test program.
This program takes the name of a file from the command line and attempts to open it, read it, and print the results to the screen; that is, it is a custom version of the cat utility.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]){
    size_t n;
    char buf[64];
    if(argc != 2){
        printf("./openfile filename\n");
        exit(-1);
    }
    FILE *f = fopen(argv[1], "r");
    if(f==NULL){
        perror("Error opening file:");
        exit(-1);
    }
    /* copy the file to standard output, one buffer at a time */
    while((n = fread(buf, 1, sizeof(buf), f)) > 0){
        write(1, buf, n);
    }
    fclose(f);
    return 0;
}

Consider the following simple policy file, openfile.sb. Apart from the dynamic libraries the program needs, it allows reading files only from /tmp (which on Mac OS X is a symbolic link to /private/tmp).

(version 1)
(debug deny)
(allow process-exec (regex #"openfile"))
(allow file-read-data file-read-metadata
  (regex #"^/usr/lib/.*\.dylib$"
         #"^/private/tmp"))
(deny default)

We can see this policy being enforced by trying to read a file named hi, which contains only the single word "hi."

$ ./openfile hi
hi
$ sandbox-exec -f openfile.sb ./openfile hi
Error opening file:: Permission denied
$ sandbox-exec -f openfile.sb ./openfile /private/tmp/hi
hi

Here, the sandbox-exec binary is simply a wrapper that sets the sandbox and then executes the other program within the sandbox as a child. As you can see, the sandbox prevents reading from arbitrary directories, but still allows the application to read from the /tmp directory.

It should be noted that sandboxes are not a cure-all. For instance, in the quicklookd example, network connections are denied but anything else is permitted. One way to achieve network access is to write an executable file to the filesystem (perhaps a script that sets up a reverse shell) and then configure launchd to start it for you. As launchd is not in the sandbox, there will be no restrictions on this new application. This is one example of circumventing the sandbox.

Additionally, it is difficult to effectively sandbox an application like Safari.
This application makes arbitrary connections to the Internet, reads and writes a variety of files (consider the file:// URI handler as well as the fact that a user can use the Save As option from the pull-down menu), and executes a variety of applications (through various URI handlers such as ssh://, vnc://, etc.). Therefore, it will be hard to write a policy that significantly hinders an attacker who gains control of the Safari process.

One final note is that the Apple-authored software that runs on Windows doesn't have additional security precautions, such as application sandboxing. When you download iTunes for Windows so that you can sync your iPhone, you open yourself up to a remote attack against the mDNSResponder running on your system without its protective sandbox.

The Internet Engineering Task Force (IETF) Zero Configuration Networking Working Group specifies three requirements for Zero Configuration Networking of the kind Bonjour provides.

■ Must be able to obtain an IP address (even without a DHCP server)
■ Must be able to do name-to-address translation (even without a DNS server)
■ Must be able to discover services on the network

Get an IP Address

The first requirement is met via RFC 3927, Dynamic Configuration of IPv4 Link-Local Addresses (or RFC 2462 for IPv6). The basic idea is to have a device try to get an IP address in the range 169.254/16.
The device selects an address from this range randomly. It then tests whether that IP address is already in use by issuing a series of Address Resolution Protocol (ARP) requests for that IP address (Figure 2-1). If an ARP reply is received, the device selects a new IP address randomly and begins again. Otherwise it has found its IP address. There are some additional stipulations for the unusual case in which other devices select this device's IP address or a race condition occurs, but the basic idea is simple enough. This RFC is the document that explains why, when your network is misconfigured, your computer gets an IP address in the range 169.254/16!

Figure 2-1: A packet capture of a device trying to see whether any other device has the address it chose

Chapter 2 ■ Mac OS X Parlance 37

In fact, all Macs keep an entry in their routing table in case a device shows up on this subnet.

$ netstat -rn | grep 169
169.254            link#4             UCS         0        0    en0

Set Up Name Translation

The second requirement is met by using Multicast DNS (mDNS). Multicast DNS is, not surprisingly, similar to DNS. The mDNS protocol uses the same packet format, name structure, and DNS record types as unicast DNS. The primary difference is that its queries are sent to all local hosts using multicast. By contrast, DNS queries are sent to a specific, preconfigured host, the name server. Another difference is that DNS listens on UDP port 53, while mDNS listens on UDP port 5353. Multicast DNS requests use the multicast address 224.0.0.251. Any machine running Bonjour listens for these multicast requests, and, if it knows the answer, it replies, usually to a multicast address. In this way, machines on the local network can continuously update their cache without making any requests.

This explains how devices can find out the IP address of named devices, but does not explain how these devices come up with their own names. For this, the strategy is similar to how IP addresses are derived.
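Because naming reuses the same probe-and-retry idea, it may help to pin down the address case first. The following is a minimal Python sketch of the selection loop described above, not Apple's implementation. The probe callback stands in for the ARP exchange and is hypothetical, as are the function and parameter names. The sketch assumes RFC 3927's rule that the first and last 256 addresses of 169.254/16 are reserved.

```python
import ipaddress
import random

# Usable range per RFC 3927: 169.254.1.0 through 169.254.254.255.
FIRST = int(ipaddress.IPv4Address("169.254.1.0"))
LAST = int(ipaddress.IPv4Address("169.254.254.255"))


def pick_link_local(probe, rng=None, max_tries=10):
    """Pick a random link-local address that `probe` reports as unused.

    `probe(addr)` is a hypothetical callback standing in for the ARP probes;
    it returns True when another host already answers for `addr`.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        candidate = ipaddress.IPv4Address(rng.randint(FIRST, LAST))
        if not probe(candidate):
            return candidate  # nobody replied, so the address is ours
    raise RuntimeError("no free link-local address found")


if __name__ == "__main__":
    taken = set()  # pretend no address is in use on this network
    print(pick_link_local(lambda addr: addr in taken))
```

A real implementation would also have to defend the address after claiming it, which is the "additional stipulations" case mentioned above and is left out here.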
The device chooses a name that ends in .local, usually based on the hostname, but it could also be chosen randomly. It then makes mDNS queries for any other machine with that name. If it finds another device with that name, it chooses a different name; otherwise it has found its name (Figure 2-2). Note that in this way, all mDNS names end in the string .local. Many operating systems, including Mac OS X and Windows (even without Bonjour installed), support mDNS names.

Figure 2-2: A packet capture showing mDNS name resolution.

Service Discovery

The final requirement of Zero Configuration Networking is met by DNS Service Discovery (DNS-SD). DNS Service Discovery uses the syntax from DNS SRV records, but uses DNS PTR records so that multiple results can be returned if more than one host offers a particular service. A client requests the PTR lookup for the name "<Service>.<Domain>" and receives a list of zero or more PTR records of the form "<Instance>.<Service>.<Domain>".

An example will help clear this up. Mac OS X comes with the dns-sd binary, which can be used to advertise services and perform lookups for services. To look for available SSH servers (Figure 2-3) on the local network, the following command can be issued, where in this case the service is _ssh._tcp and the domain defaults to local.

$ dns-sd -B _ssh._tcp
Browsing for _ssh._tcp
Timestamp     A/R Flags if Domain   Service Type   Instance Name
9:13:46.475   Add     3  4 local.   _ssh._tcp.     Charlie Miller's Computer
9:13:46.475   Add     2  4 local.   _ssh._tcp.     Dragos Ruiu’s MacBook Air
^C

In the packet structure, the packets look just like DNS queries except they are on port 5353 and they are sent to a multicast address.

For another example, dns-sd can be run in one window looking for web pages, and in another it can advertise the fact that a service is available.

$ dns-sd -B _http._tcp
Browsing for _http._tcp
Timestamp     A/R Flags if Domain   Service Type   Instance Name
9:52:51.203   Add     2  4 local.   _http._tcp.    DVR 887A

This shows an existing HTTP service called DVR 887A already on the network. This happens to be a TiVo. In another window, dns-sd can be used to advertise a service:

$ dns-sd -R "Index" _http._tcp . 80 path=/index.html
Registering Service Index._http._tcp port 80 TXT path=/index.html
9:53:03.998 Got a reply for service Index._http._tcp.local.: Name now registered and active

This command registers an HTTP service on port 80. Notice that the machine doesn't actually have such a service, but dns-sd is free to send the packets that indicate that such a service exists. The original dns-sd command sees this new service available and adds it.

9:53:04.250   Add     3  4 local.   _http._tcp.    Index

You can see how quickly this information is propagated; it took 0.25 seconds for the listener to add the new service after it was registered. This is because the new service, upon starting, multicasts its presence to everyone on the subnet. The listener didn't have to ask; it just had to be listening. This helps keep the level of network traffic for Bonjour to a minimum. If you kill the advertising of the HTTP service from the second window by pressing Ctrl+C, the original window sees it going away and removes it.

9:53:13.066   Rmv     1  4 local.   _http._tcp.    Index

Figure 2-3: Packet capture for an SSH service query

Bonjour

Some administrators perceive Bonjour as a security risk because it advertises available services. This perception is a fallacy. Advertising services doesn't make the services any more or less vulnerable. An attacker could still actively probe for services. If you really want to turn off Bonjour, you can use the following command to disable it.

$ sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist

If you are worried about the mDNSResponder service itself having a vulnerability, then this might be a smart command to run.
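To make the earlier point about packet structure concrete (a browse is an ordinary DNS question, sent to port 5353 on a multicast address), here is a minimal Python sketch that builds the bytes of a PTR question for a service type. It only constructs the packet and sends nothing. The helper names encode_name and build_ptr_query are illustrative assumptions, and a real responder sets more fields than this sketch does.

```python
import struct

MDNS_GROUP = "224.0.0.251"  # multicast address given in the text
MDNS_PORT = 5353            # UDP port given in the text
TYPE_PTR = 12               # DNS record type PTR
CLASS_IN = 1                # DNS class IN


def encode_name(name):
    """Encode a dotted name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        if not 0 < len(raw) < 64:
            raise ValueError("bad label: %r" % label)
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the empty root label terminates the name


def build_ptr_query(service="_ssh._tcp.local."):
    """Build a one-question PTR query for a "<Service>.<Domain>" name."""
    # Header fields: id, flags, questions, answers, authorities, additionals
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service) + struct.pack("!2H", TYPE_PTR, CLASS_IN)
    return header + question


if __name__ == "__main__":
    print(build_ptr_query().hex())
```

Handing the result to a UDP socket's sendto() with (MDNS_GROUP, MDNS_PORT) as the destination would put a question of this shape on the wire; the six 16-bit header fields are the same ones that appear in the DNSMessageHeader structure in the mDNSResponder source.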
Another way to view Bonjour activity on the network is with Bonjour Browser (www.tildesoft.com); see Figure 2-4.

Figure 2-4: Bonjour Browser shows all advertised services.

You can see some of the service names, such as _odisk, _tivo-videos, _http, _ssh, and _workstation. _odisk is the remote disk sharing used by Mac OS X to share out a DVD or CD-ROM drive.

Another way to interact with Bonjour is programmatically through Python. The pyzeroconf package (sourceforge.net/projects/pyzeroconf) provides Python bindings for Zero Configuration Networking. For example, the following Python script performs the same browse for SSH servers as the dns-sd command executed earlier.

import Zeroconf

class MyListener(object):
    def removeService(self, server, type, name):
        print "Service", repr(name), "removed"

    def addService(self, server, type, name):
        print "Service", repr(name), "added"
        # Request more information about the service
        try:
            info = server.getServiceInfo(type, name)
            print 'Additional info:', info
        except:
            pass

if __name__ == '__main__':
    server = Zeroconf.Zeroconf()
    listener = MyListener()
    browser = Zeroconf.ServiceBrowser(server, "_ssh._tcp.local.", listener)

Running this script gives the location of advertised SSH servers on this local network.

$ python query.py
Service u"Charlie Miller's Computer._ssh._tcp.local." added
Additional info: service[Charlie Miller's Computer._ssh._tcp.local.,192.168.1.182:22,]
Service u'Dragos Ruiu\u2019s MacBook Air._ssh._tcp.local.' added

mDNSResponder

Now that you understand how Bonjour works in practice, it may be useful to look at the source code for mDNSResponder. This is the application responsible for handling Bonjour on Mac OS X computers and is one of the only listening services in Mac OS X out of the box.
This application had the honor of possessing the first out-of-the-box remote root in OS X (this vulnerability could be activated across the Internet, even if the firewall was turned on and set to the most restrictive settings possible using the GUI). For these reasons, it deserves a closer look.

To get the source code, go to Apple's CVS server.

$ export CVSROOT=:ext:apsl@anoncvs.opensource.apple.com:/cvs/apsl
$ export CVS_RSH=ssh
$ cvs co mDNSResponder

It will ask for a password. Use your Apple ID and password separated by a colon, like id:pass. Take a look at the directory structure.

$ ls
CVS        PrivateDNS.txt     mDNSMacOS9          mDNSShared
Clients    README.txt         mDNSMacOSX          mDNSVxWorks
LICENSE    buildResults.xml   mDNSPosix           mDNSWindows
Makefile   mDNSCore           mDNSResponder.sln

There is a central location of code for all platforms (mDNSShared), as well as platform-specific directories (such as mDNSMacOSX and mDNSWindows). These platform-specific files contain information about the application's low-level needs, such as how to send and receive UDP packets or how to join a multicast group. There is also a Visual Studio file for building in a Windows environment and an Xcode project file that is invoked by the Makefile. As this is the first time you've encountered the need to use Xcode, we'll take a moment to explain Xcode projects.

A Digression about Xcode

Xcode is Apple's Integrated Development Environment (IDE). It is free to download and comes on the Mac OS X installation DVD (although it is not installed by default). It consists of a sophisticated GUI built on top of the GCC compiler. You can open an Xcode project by double-clicking on it in Finder or by using the open command:

$ open mDNSMacOSX/mDNSResponder.xcodeproj

This command will bring up the main Xcode window; see Figure 2-5. You can use this GUI to change the configurations, edit and view source files, or even build the application.
In this case, let's make some changes to how the project is built. We will make it easier to debug by adding symbols and removing optimizations. Select Project ➢ Edit Project Settings. In the window that appears, select the Build tab. This tab controls all the settings that are normally passed as options to the compiler. In the search box, type debug. This will bring up all the configuration settings related to debugging. Change the optimization to O0, and make sure the binary is not stripped and that debugging symbols are produced. Make the necessary changes, as in Figure 2-6, and close the Xcode project.

Figure 2-5: The Xcode project for mDNSResponder

Figure 2-6: Changes to make a debug version of mDNSResponder

Build the project by typing SRCROOT=. make or use the xcodebuild command-line interface:

$ xcodebuild install -target mDNSResponder

For the majority of projects, running xcodebuild without any arguments in the same directory as the corresponding .xcodeproj file will build the project. To start over, you can run the equivalent of "make clean":

$ xcodebuild clean

When the project is built successfully, many libraries and binaries will be produced, including mDNSMacOSX/usr/sbin/mDNSResponder. To run this, move the real mDNSResponder aside as a backup and put the freshly built one in its place. Then kill the mDNSResponder process; a new one will be spawned automatically.

$ sudo mv /usr/sbin/mDNSResponder /usr/sbin/mDNSResponder.bak
$ sudo cp mDNSMacOSX/usr/sbin/mDNSResponder /usr/sbin/
$ sudo chmod 555 /usr/sbin/mDNSResponder
$ sudo killall -9 mDNSResponder

Source Code

Due to the importance of this application, and to get a feeling for Apple source code in general, we'll now take a closer look at some of the source code from the project. We'll concentrate on the code that is shared for all the platforms, located in mDNSCore.
From a security perspective, it is important to know where untrusted network data enters the application. This occurs in the mDNSCoreReceive function from the file mDNS.c.

mDNSexport void mDNSCoreReceive(mDNS *const m, void *const pkt,
    const mDNSu8 *const end, const mDNSAddr *const srcaddr,
    const mDNSIPPort srcport, const mDNSAddr *dstaddr,
    const mDNSIPPort dstport, const mDNSInterfaceID InterfaceID)
{
    mDNSInterfaceID ifid = InterfaceID;
    DNSMessage *msg = (DNSMessage *)pkt;
    const mDNSu8 StdQ = kDNSFlag0_QR_Query | kDNSFlag0_OP_StdQuery;
    const mDNSu8 StdR = kDNSFlag0_QR_Response | kDNSFlag0_OP_StdQuery;
    const mDNSu8 UpdR = kDNSFlag0_QR_Response | kDNSFlag0_OP_Update;
    mDNSu8 QR_OP;
    mDNSu8 *ptr = mDNSNULL;
    mDNSBool TLS = (dstaddr == (mDNSAddr *)1); // For debug logs: dstaddr = 0 means TCP; dstaddr = 1 means TLS
    if (TLS) dstaddr = mDNSNULL;
    …
    if ((unsigned)(end - (mDNSu8 *)pkt) < sizeof(DNSMessageHeader))
    {
        LogMsg("DNS Message too short");
        return;
    }
    QR_OP = (mDNSu8)(msg->h.flags.b[0] & kDNSFlag0_QROP_Mask);
    // Read the integer parts which are in IETF byte-order (MSB first, LSB second)
    ptr = (mDNSu8 *)&msg->h.numQuestions;
    msg->h.numQuestions   = (mDNSu16)((mDNSu16)ptr[0] << 8 | ptr[1]);
    msg->h.numAnswers     = (mDNSu16)((mDNSu16)ptr[2] << 8 | ptr[3]);
    msg->h.numAuthorities = (mDNSu16)((mDNSu16)ptr[4] << 8 | ptr[5]);
    msg->h.numAdditionals = (mDNSu16)((mDNSu16)ptr[6] << 8 | ptr[7]);

    if (!m) { LogMsg("mDNSCoreReceive ERROR m is NULL"); return; }

    // We use zero addresses and all-ones addresses at various places in the code to indicate special values like "no address"
    // If we accept and try to process a packet with zero or all-ones source address, that could really mess things up
    if (srcaddr && !mDNSAddressIsValid(srcaddr))
    {
        debugf("mDNSCoreReceive ignoring packet from %#a", srcaddr);
        return;
    }

    mDNS_Lock(m);
    m->PktNum++;
    …
    if (QR_OP == StdQ)
        mDNSCoreReceiveQuery(m, msg, end, srcaddr, srcport, dstaddr, dstport, ifid);
    else if (QR_OP == StdR)
        mDNSCoreReceiveResponse(m, msg, end, srcaddr, srcport, dstaddr, dstport, ifid);
    else if (QR_OP != UpdR)
    {
        LogMsg("Unknown DNS packet type %02X%02X from %#-15a:%-5d to %#-15a:%-5d on %p (ignored)",
            msg->h.flags.b[0], msg->h.flags.b[1], srcaddr, mDNSVal16(srcport),
            dstaddr, mDNSVal16(dstport), InterfaceID);
    }
    // Packet reception often causes a change to the task list:
    // 1. Inbound queries can cause us to need to send responses
    // 2. Conflicing response packets received from other hosts can cause us to need to send defensive responses
    // 3. Other hosts announcing deletion of shared records can cause us to need to re-assert those records
    // 4. Response packets that answer questions may cause our client to issue new questions
    mDNS_Unlock(m);
}

The raw data from the network enters this function in the pkt variable. The function then uses msg as a pointer to a structure that describes the format of the packet.

(gdb) print *((DNSMessage *) pkt)
$2 = {
  h = {
    id = {
      b = "\000",
      NotAnInteger = 0
    },
    flags = {
      b = "\000",
      NotAnInteger = 0
    },
    numQuestions = 768,
    numAnswers = 0,
    numAuthorities = 0,
    numAdditionals = 0
  },
  data = "\bDVR 887A\f_tivo-videos\004_tcp\005local\000\000!\000
\001?\f\000\020\000\001\bDVR-5C90?'\000\001\000\001prisoner\004iana
\003org\000\nhostmaster\froot-servers?T\000\000\000\001\000\000\a\
b\000\000\003?\000\t:?\000\t:?Command=QueryContainer&Container=%2FNowPla
ying\030swversion=9.3.1-01-2-649\024platf"...
}

Now back to the source code.
typedef packedstruct
{
    mDNSOpaque16 id;
    mDNSOpaque16 flags;
    mDNSu16 numQuestions;
    mDNSu16 numAnswers;
    mDNSu16 numAuthorities;
    mDNSu16 numAdditionals;
} DNSMessageHeader;

// We can send and receive packets up to 9000 bytes (Ethernet Jumbo Frame size, if that ever becomes widely used)
// However, in the normal case we try to limit packets to 1500 bytes so that we don't get IP fragmentation on standard Ethernet
// 40 (IPv6 header) + 8 (UDP header) + 12 (DNS message header) + 1440 (DNS message body) = 1500 total
#define AbsoluteMaxDNSMessageData 8940
#define NormalMaxDNSMessageData 1440
typedef packedstruct
{
    DNSMessageHeader h; // Note: Size 12 bytes
    mDNSu8 data[AbsoluteMaxDNSMessageData]; // 40 (IPv6) + 8 (UDP) + 12 (DNS header) + 8940 (data) = 9000
} DNSMessage;

With these structures overlaid on the packet, mDNSCoreReceive converts the four header counts from network byte order to host byte order and, depending on the type of packet, calls either mDNSCoreReceiveQuery or mDNSCoreReceiveResponse. These two functions break out the data further and process it. The entire code is large, but this shows one place where outside data enters the system. Another spot where outside data enters mDNSResponder is in the file LegacyNATTraversal.c. Any file or function in source code containing the word legacy always requires a second look by a code auditor.

QuickTime

QuickTime Player plays a large variety of different file types. Some are well known (like .mp3, .avi, and .gif), and most common audio- and video-player software can understand them. QuickTime Player also plays a number of Apple-developed file formats that many other players may not support. In addition, QuickTime Player communicates with servers using a few protocols that are not common. In this section we'll outline some of the file types and protocols that were originally introduced for QuickTime Player.

.mov

The QuickTime file format (.mov) was designed by Apple and is now the basis for MPEG-4. It consists of containers that store one or more tracks.
Each track can store a different type of data, such as audio, video, or text. The fundamental unit for a .mov file is the atom. An atom begins with a 32-bit unsigned integer size, followed by a 32-bit type. The rest of the atom is the data for that atom. This data may contain other atoms; see Figure 2-7.

Figure 2-7: The atom structure of a .mov file

The size value indicates the total number of bytes in the atom, and the type usually consists of four bytes from the ASCII range of values. The size value can also be an extended size, which allows for sizes larger than 32 bits. In the case of extended size, the size field is set to 1 (which would not normally be valid since the size field contains the number of bytes in the whole atom, including the size field itself and the type field). When an extended size is needed, the 64 bits after the type are used for the size. Finally, if the size value is set to zero, the atom is assumed to extend for the rest of the file so that the size is the length of the file from that point onward.

Take a look at the atom structure for an actual file.

$ hexdump -C L33t_Haxxors.mov | head
00 00 00 20 66 74 79 70 71 74 20 20 20 05 03 00 |... ftypqt   ...|
71 74 20 20 00 00 00 00 00 00 00 00 00 00 00 00 |qt  ............|
00 01 16 3b 6d 6f 6f 76 00 00 00 6c 6d 76 68 64 |...;moov...lmvhd|
00 00 00 00 c2 24 a3 f9 c2 24 a3 fb 00 00 02 58 |....?$???$??...X|
00 01 64 49 00 01 00 00 01 00 00 00 00 00 00 00 |..dI............|
00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 |................|
00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 |................|
00 00 00 00 40 00 00 00 00 00 00 00 00 00 04 b0 |....@..........?|
00 00 07 08 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00 00 00 09 00 00 03 17 74 72 61 6b 00 00 00 5c |........trak...\|
74 6b 68 64 00 00 00 0f c1 f2 72 0e c2 24 a3 fb |tkhd....??r.?$??|

The first atom begins with a length of 0x20 and a type of ftyp.
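The size rules just described (an ordinary 32-bit size, an extended 64-bit size signalled by a size of 1, and a size of 0 meaning "to the end of the file") can be sketched as a small atom walker. This is an illustrative Python sketch rather than anything QuickTime itself does; the name walk_atoms and its return tuple are assumptions, and the example bytes are the first 32 bytes of the dump above.

```python
import struct


def walk_atoms(data, offset=0, end=None):
    """Yield (offset, type, total_size, header_len) for atoms in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # Extended size: the 64 bits after the type hold the real size.
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # Size 0: the atom extends to the end of the data.
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated atom, so stop walking
        yield offset, kind.decode("latin-1"), size, header
        offset += size


if __name__ == "__main__":
    ftyp = bytes.fromhex("0000002066747970717420202005030071742020") + bytes(12)
    for off, kind, size, _ in walk_atoms(ftyp):
        print(hex(off), kind, hex(size))
```

To descend into a container atom, the same function can be called again with offset set just past the container's header and end set to the container's last byte. Note that the sketch simply stops on a size that does not fit; how a real parser reacts to such inconsistent sizes is exactly the kind of behavior worth examining when auditing or fuzzing.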
Referring to the specification, the ftyp type corresponds to the File Type Atom. The data in this particular type of atom is the Major_Brand, a 32-bit integer, the Minor_Version, and a series of Compatible_Brands. The next atom, beginning at offset 0x20 in the file, has size 0x1163b and is of type moov, or a Movie Atom. The Movie Atom is large and can contain many different types of atoms. In this case, the first thing that shows up in the data is a Movie Header Atom with size 0x6c and type mvhd. See Figure 2-8 for more data broken out by type.

Figure 2-8: The .mov file broken out by atom. All sizes are in hexadecimal.

Being familiar with the layout of the files will help in fuzzing or auditing the QuickTime Player application. We'll discuss reverse engineering and fuzzing in Chapters 5 and 6, but to see how knowing the file format helps in reverse-engineering the player, first find the library responsible for parsing .mov files. You can do this by finding the libraries used by QuickTime Player and then searching through the strings in each library for the names of the atom types.
$ otool -L QuickTime\ Player
QuickTime Player:
/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit (compatibility version 45.0.0, current version 949.0.0)
/System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices (compatibility version 1.0.0, current version 34.0.0)
/System/Library/Frameworks/Carbon.framework/Versions/A/Carbon (compatibility version 2.0.0, current version 136.0.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 476.0.0)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 677.0.0)
/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
/System/Library/Frameworks/QTKit.framework/Versions/A/QTKit (compatibility version 1.0.0, current version 1.0.0)
/System/Library/Frameworks/QuickTime.framework/Versions/A/QuickTime (compatibility version 1.0.0, current version 861.0.0)
/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 31122.0.0)
/System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration (compatibility version 1.0.0, current version 204.0.0)
/System/Library/Frameworks/Quartz.framework/Versions/A/Quartz (compatibility version 1.0.0, current version 1.0.0)
/System/Library/Frameworks/QuartzCore.framework/Versions/A/QuartzCore (compatibility version 1.2.0, current version 1.5.0)
/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.4.0)
/usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.0.0)
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 32.0.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 227.0.0)

$ otool -L QuickTime\ Player| xargs grep "moov" 2> /dev/null
Binary file /System/Library/Frameworks/QTKit.framework/Versions/A/QTKit matches
Binary file /System/Library/Frameworks/QuickTime.framework/Versions/A/QuickTime matches

The second library in the list seems the most promising, so grab it and load it into IDA Pro. Search for one of the unsigned integers that represents an atom type; for example, "moov" = 0x6d6f6f76. You can do this by selecting Search and typing in your search term. There will be many occurrences of this; see Figure 2-9.

Using this method, you can find the functions that are parsing for the atom type. This allows you to find the relevant parsing code quickly, even in the middle of complicated functions; see Figure 2-10.

Reading through the specification, you can choose a more obscure atom type such as the Preview atom, "pnot" = 0x706e6f74. Here only three functions use this value: _NewMovieFromDataRefPriv_priv, _AddFilePreview, and _MakeFilePreview; see Figure 2-11.

Figure 2-9: There are many comparisons against the string "moov" in the QuickTime library.

Figure 2-10: A complicated function responsible for checking atom types found with grep

Figure 2-11: There are only three occurrences of "pnot" in the QuickTime library.

Using even this very basic technique can allow you to focus quickly on the portions of code associated with particular atom types.

There are other Apple-created file types, such as QuickTime Media Link (.qtl) and QuickTime Virtual Reality (.qtvr), that QuickTime Player can process by default. You must understand these, along with all the non-Apple file formats, to evaluate the security of client-side applications on a Mac OS X computer. We'll discuss this more in the next chapter.

RTSP

Besides file formats, QuickTime Player uses some uncommon network protocols.
To get video on demand, it uses the Real Time Streaming Protocol (RTSP) to access metafile information and issue streaming commands. It uses the Real-time Transport Protocol (RTP) for the actual video and audio content. These protocols have been a source of vulnerabilities in the past; see CVE-2007-6166 and CVE-2008-0234 for specific instances of RTSP vulnerabilities.

RTSP is similar in design to HTTP, with the biggest difference being that RTSP has a session identifier that allows for stateful transactions. Different RTSP requests can be linked together by including the same session identifier in each request. By contrast, HTTP is stateless, meaning each individual HTTP request is independent of all previous (and future) requests. RTSP may be transmitted over TCP or UDP. While TCP and UDP differ in their underlying delivery mechanism, the RTSP application protocol is still considered stateful due to the inclusion of the session identifier. Figure 2-12 shows a typical RTSP session. Possible RTSP methods include

■ OPTIONS: Get available methods
■ SETUP: Initialize session
■ ANNOUNCE: Change description of media object
■ DESCRIBE: Get description of media object
■ PLAY: Start playback

Look at the RTSP protocol in action. First you need an RTSP server. For this you can use either the QuickTime Streaming Server that comes on Mac OS X Server or the Darwin Streaming Server, which is open source. The Darwin server can be obtained from http://dss.macosforge.org/. The binary package comes in a .dmg file that will launch automatically and take you to the web-server interface on port 1220. The default location for media content is /Library/QuickTimeStreaming/Movies/. Figure 2-13 shows the administrative interface.

Figure 2-13: The administrative interface for the QuickTime Streaming Server

To have some content available for download, select Playlists ➢ New Media Playlist.
Add a file to the playlist, like the file sample_100kbit.mov that comes with the Darwin server. Name the playlist test. Then press the play button on the Playlist page for the new test playlist; see Figure 2-14.

You can now use QuickTime Player to connect to the media server by launching QuickTime Player and selecting File ➢ Open URL and entering

rtsp://localhost/test.sdp

The movie should play in the viewer. Capturing the packets shows how the exchange proceeds from RTSP to RTP; see Figure 2-15.

Figure 2-14: The server is now streaming live media.

Figure 2-15: A packet capture that shows the transition from RTSP to RTP

Looking at the RTSP that was exchanged, we see the first leg of the conversation started by the player issuing the following request:

DESCRIBE rtsp://192.168.1.182/test.sdp RTSP/1.0
CSeq: 1
Accept: application/sdp
Bandwidth: 384000
Accept-Language: en-US
User-Agent: QuickTime/7.4.1 (qtver=7.4.1;cpu=IA32;os=Mac 10.5.2)

Notice the sequence number 1. The server responds with the contents of the .sdp playlist file requested. These .sdp files are another file format that lies on the attack surface of QuickTime Player.
RTSP/1.0 200 OK
Server: QTSS/6.0.3 (Build/526.3; Platform/MacOSX; Release/Darwin Streaming Server; State/Development; )
Cseq: 1
Cache-Control: no-cache
Content-length: 386
Date: Wed, 09 Jul 2008 15:19:11 GMT
Expires: Wed, 09 Jul 2008 15:19:11 GMT
Content-Type: application/sdp
x-Accept-Retransmit: our-retransmit
x-Accept-Dynamic-Rate: 1
Content-Base: rtsp://192.168.1.182/test.sdp/

v=0
o=QTSS_Play_List 140087043 422545485 IN IP4 192.168.1.182
s=test
c=IN IP4 0.0.0.0
b=AS:94
t=0 0
a=x-broadcastcontrol:RTSP
a=control:*
m=video 0 RTP/AVP 96
b=AS:79
a=3GPP-Adaptation-Support:1
a=rtpmap:96 X-SV3V-ES/90000
a=control:trackID=1
m=audio 0 RTP/AVP 97
b=AS:14
a=3GPP-Adaptation-Support:1
a=rtpmap:97 X-QDM/22050/2
a=control:trackID=2
a=x-bufferdelay:4.97

Next the client attempts to set up for the first track.

SETUP rtsp://192.168.1.182/test.sdp/trackID=1 RTSP/1.0
CSeq: 2
Transport: RTP/AVP;unicast;client_port=6970-6971
x-retransmit: our-retransmit
x-dynamic-rate: 1
x-transport-options: late-tolerance=2.384000
User-Agent: QuickTime/7.4.1 (qtver=7.4.1;cpu=IA32;os=Mac 10.5.2)
Accept-Language: en-US

After some negotiations back and forth where the server issues OPTIONS headers, the server finally responds with an OK and lists all of the necessary parameters, such as port numbers and session IDs.

RTSP/1.0 200 OK
Server: QTSS/6.0.3 (Build/526.3; Platform/MacOSX; Release/Darwin Streaming Server; State/Development; )
Cseq: 3
Session: 2239848818749704366
Cache-Control: no-cache
Date: Wed, 09 Jul 2008 15:19:11 GMT
Expires: Wed, 09 Jul 2008 15:19:11 GMT
Transport: RTP/AVP;unicast;source=192.168.1.182;client_port=6972-6973;server_port=6970-6971
x-Transport-Options: late-tolerance=2.384000
x-Retransmit: our-retransmit
x-Dynamic-Rate: 1

The client can now begin playing the media.

    PLAY rtsp://192.168.1.182/test.sdp RTSP/1.0
    CSeq: 4
    Range: npt=0.000000-
    x-prebuffer: maxtime=2.000000
    x-transport-options: late-tolerance=10
    Session: 2239848818749704366
    User-Agent: QuickTime/7.4.1 (qtver=7.4.1;cpu=IA32;os=Mac 10.5.2)

At this point, the media server begins streaming the actual contents of the media to the client via RTP over UDP. The client can control this by using Real-time Transport Control Protocol (RTCP). After the viewer finishes watching the media, they may choose to pause or tear down the connection. Below is the back-and-forth between client and server.

    PAUSE rtsp://192.168.1.182/test.sdp RTSP/1.0
    CSeq: 6
    Session: 2239848818749704366
    User-Agent: QuickTime/7.4.1 (qtver=7.4.1;cpu=IA32;os=Mac 10.5.2)

    RTSP/1.0 200 OK
    Server: QTSS/6.0.3 (Build/526.3; Platform/MacOSX; Release/Darwin Streaming Server; State/Development; )
    Cseq: 6
    Session: 2239848818749704366

    TEARDOWN rtsp://192.168.1.182/test.sdp RTSP/1.0
    CSeq: 7
    Session: 2239848818749704366
    User-Agent: QuickTime/7.4.1 (qtver=7.4.1;cpu=IA32;os=Mac 10.5.2)

    RTSP/1.0 200 OK
    Server: QTSS/6.0.3 (Build/526.3; Platform/MacOSX; Release/Darwin Streaming Server; State/Development; )
    Cseq: 7
    Session: 2239848818749704366
    Connection: Close

With the history of vulnerabilities in the handling of RTSP, it’s worth your time to become familiar with this protocol. Your knowledge can be leveraged for fuzzing or reverse engineering.

As we did for .mov files, let’s use our knowledge of the protocol to find some important parts of the QuickTime binaries. First we must find the library (or application) that contains the RTSP parsing code. For this, select something from the protocol you wouldn’t expect to see anywhere else, for example the term TEARDOWN. Trying to grep for this word in the libraries that QuickTime Player is linked to, as we did before, fails.

    $ otool -L QuickTime\ Player| xargs grep TEARDOWN 2> /dev/null
    $

This is because QuickTime Player loads many libraries dynamically at runtime, including the so-called QuickTime Components. Attaching to a running QuickTime Player with GDB and issuing the info sharedlibrary command reveals more of the libraries QuickTime actually uses (others are loaded on demand).

    (gdb) info sharedlibrary
    The DYLD shared library state has not yet been initialized.
    Requested State Current State
    Num Basename Type Address Reason | | Source
    | | | | | | | |
    1 QuickTime Player - 0x1000 exec Y Y /Applications/QuickTime Player.app/Contents/MacOS/QuickTime Player (offset 0x0)
    2 dyld - 0x8fe00000 dyld Y Y /usr/lib/dyld at 0x8fe00000 (offset 0x0) with prefix "__dyld_"
    3 AppKit F 0x95255000 dyld Y Y /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit at 0x95255000 (offset -0x6adab000)
    4 ApplicationServices F 0x904ac000 dyld Y Y /System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices at 0x904ac000 (offset -0x6fb54000)
    5 Carbon F 0x90f06000 dyld Y Y /System/Library/Frameworks/Carbon.framework/Versions/A/Carbon at 0x90f06000 (offset -0x6f0fa000)
    …
    126 ApplePixletVideo - 0x173fa000 dyld Y Y /System/Library/QuickTime/ApplePixletVideo.component/Contents/MacOS/ApplePixletVideo at 0x173fa000 (offset 0x173fa000)
    127 RawCamera B 0x175d9000 dyld Y Y /System/Library/CoreServices/RawCamera.bundle/Contents/MacOS/RawCamera at 0x175d9000 (offset 0x175d9000)
    128 QuickTimeImporters - 0x96120000 dyld Y Y /System/Library/QuickTime/QuickTimeImporters.component/Contents/MacOS/QuickTimeImporters at 0x96120000 (offset -0x69ee0000)
    129 Unicode Encodings B 0x155ce000 dyld Y Y /System/Library/TextEncodings/Unicode Encodings.bundle/Contents/MacOS/Unicode Encodings at 0x155ce000 (offset 0x155ce000)

In this case there are 129 libraries loaded within the QuickTime process!
The RTSP code could be located in any one of them (or any combination of them). Using your knowledge of the protocol, you can easily find at least one that contains some RTSP processing code:

    $ find -X /System/Library/ -type f 2>/dev/null | grep 'Contents/MacOS' | xargs grep TEARDOWN 2> /dev/null
    Binary file /System/Library//QuickTime/QuickTimeStreaming.component/Contents/MacOS/QuickTimeStreaming matches

This could have been done with a simple grep, but the preceding command executes faster. Firing up IDA Pro and loading this library quickly reveals portions of the executable that deal with RTSP. Following the cross-references (DATA and CODE) from the string “TEARDOWN” leads to the call chain in Figure 2-17.

The QuickTime vulnerability (CVE-2007-6166) in the RTSP Content-Type handling took place in a memory copy within the EngineNotificationProc. Therefore, by knowing only a little about the protocol, it is possible to zero in on the portions of the binary that process the protocol. There will be more on exploiting this particular RTSP bug in Chapter 10, “Real-World Exploits,” and more on reverse engineering in Chapter 6, “Reverse Engineering.”

Conclusion

Mac OS X uses a variety of Internet protocols and file formats. Most of these are the same as you would find in a Windows, Linux, or Solaris environment. Nevertheless, Mac OS X does use a few Apple-developed or not-very-common protocols and file formats. This chapter looked at a few of these, including Bonjour, the QuickTime file format, and RTSP. It then showed how knowing the protocol or file format can help you find which libraries are utilized by Mac OS X to process those protocols.
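
The library hunt described in this chapter can also be scripted. The following is a minimal Python sketch, not the tooling used in this chapter, that mirrors the find and grep pipeline: it walks a directory tree and reports regular files under Contents/MacOS whose bytes contain a protocol keyword. The function name find_binaries_containing and its path_hint parameter are hypothetical names chosen for illustration, and reading each file fully into memory is a simplification that a real tool would probably avoid.

```python
import os
from pathlib import Path


def find_binaries_containing(root, keyword, path_hint="Contents/MacOS"):
    """Yield paths of regular files under root whose bytes contain keyword.

    path_hint mirrors the grep 'Contents/MacOS' filter from the shell
    pipeline; it is a hypothetical parameter, not part of any Apple tool.
    """
    needle = keyword.encode("ascii")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Only consider files whose path looks like a bundle executable.
            if path_hint not in path.as_posix():
                continue
            # Like find -type f, skip symlinks and anything that is not a regular file.
            if path.is_symlink() or not path.is_file():
                continue
            try:
                with open(path, "rb") as handle:
                    data = handle.read()  # simplification: whole file in memory
            except OSError:
                continue  # unreadable files are skipped silently
            if needle in data:
                yield str(path)
```

Calling find_binaries_containing("/System/Library", "TEARDOWN") on a Mac would be the rough equivalent of the command shown earlier; which files it reports depends on the system being examined.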

References

http://zeroconf.org
http://www.multicastdns.org/
http://files.multicastdns.org/draft-cheshire-dnsext-multicastdns.txt
http://www.mactech.com/articles/mactech/Vol.21/21.11/AutomaticServiceDirectory/index.html
http://www.phrack.org/issues.html?issue=64&id=11
http://www.dns-sd.org/
http://tools.ietf.org/html/rfc2326
http://sourceforge.net/projects/pyzeroconf
http://developer.apple.com/documentation/QuickTime/QTFF/qtff.pdf
http://www.cs.columbia.edu/~hgs/teaching/ais/slides/2003/RTSP.pdf
http://projects.info-pull.com/moab/MOAB-01-01-2007.html
http://www.us-cert.gov/cas/techalerts/TA07-334A.html
http://aluigi.altervista.org/adv/quicktimebof-adv.txt
http://bardissi.wordpress.com/2008/01/11/zero-day-rtsp-hole-menaces-quicktime-again/
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-6166

also the drivers associated with Bluetooth and the wireless card. The associated code was all written by Apple, so perhaps there are vulnerabilities to find in it. Recall the big 2006 scandal in which David Maynor and Johnny “Cache” Ellch allegedly found some bugs in the MacBook wireless drivers that allowed them to take over any MacBook remotely. While the validity of this story was never confirmed, the best thing about attacking at these lowest levels is that if it works, you automatically get root.

Since not everyone is into kernel-level bugs and exploits, the more obvious place to look is at the applications that run in Mac OS X. In other words, look for the open TCP and UDP ports and determine what applications are associated with them. Out of the box, not many things are exposed to remote attackers. The command in the following code snippet will list the processes that are listening by default.

    $ sudo lsof -P | grep IPv | grep -v localhost
    ntpd 14 root 20u IPv4 0t0 UDP *:123
    ntpd 14 root 21u IPv6 0t0 UDP *:123
    ntpd 14 root 26u IPv4 0t0 UDP 192.168.1.4:123
    mDNSRespo 21 _mdnsresponder 7u IPv4 0t0 UDP *:5353
    mDNSRespo 21 _mdnsresponder 8u IPv6 0t0 UDP *:5353
    configd 33 root 8u IPv4 0t0 UDP *:*
    configd 33 root 11u IPv6 0t0 ICMPV6 *:*
    SystemUIS 87 cmiller 9u IPv4 0t0 UDP *:*
    cupsd 601 root 9u IPv4 0t0 UDP *:631

By examining the output, you can observe there are no open TCP ports. There are three open UDP ports, however, which have ntpd, mDNSResponder, and cupsd listening, respectively. configd and SystemUIServer are not bound to any particular port. The Network Time Protocol daemon, ntpd, is a well-known open-source server. cupsd is the daemon responsible for printing on many UNIX systems. It too is a well-known open-source server; however, the Common Unix Printing System (CUPS) has a long history of security bugs. Looking closer at the lsof output in the code example shows that cupsd is listening only on the external interface on UDP port 631. This implies that only a small subset of the functionality of CUPS is exposed by default (for instance, the administrative web interface is not accessible). The remaining service, mDNSResponder, is the only one of the three that is written by Apple and not widely used.

Because mDNSResponder is the only Apple-written daemon that processes packets out of the box, the previous chapter looked briefly at the protocol used by it, as well as some of the source code from it. Apple is committed to having Bonjour running out of the box on their systems, but they have done what they can to minimize the resulting exposure. First, Bonjour doesn’t run as root, but rather as the unprivileged _mdnsresponder user. Even more critically, though, this program is run within a tightly controlled sandbox. ntpd is also run in a sandbox. (Curiously, cupsd is not.) The following is the sandbox file for mDNSResponder.
Chapter 3 ■ Attack Surface

    (version 1)

    ; WARNING: The sandbox rule capabilities and syntax used in this file are currently an
    ; Apple SPI (System Private Interface) and are subject to change at any time without notice.
    ; Apple may in future announce an official public supported sandbox API, but until then Developers
    ; are cautioned not to build products that use or depend on the sandbox facilities illustrated here.

    ; Use "debug all" to log all operations examined by seatbelt, whether allowed or not.
    ; Use "debug deny" to log only operations that are denied by seatbelt
    ; to discover what specific attempted operation is causing an exception.
    ;(debug all)
    (debug deny)

    ; To help debugging, "with send-signal SIGFPE" will trigger a fake floating-point exception,
    ; which will crash the process and show the call stack leading to the offending operation.
    ; For the shipping version "deny" is probably better because it vetoes the operation
    ; without killing the process.
    (deny default)
    ;(deny default (with send-signal SIGFPE))

    ; Special exception: "send-signal" command does not apply to the mach-* operations,
    ; so for those we have to use a plain unadorned "deny" instead
    ; (which means we may not get any notification of unintentional mach-* denials)
    (deny mach-lookup)
    (deny mach-priv-host-port)

    ; Mach communications
    ; These are needed for things like getpwnam, hostname changes, & keychain
    (allow mach-lookup
        (global-name "com.apple.bsd.dirhelper"
                     "com.apple.distributed_notifications.2"
                     "com.apple.ocspd"
                     "com.apple.mDNSResponderHelper"
                     "com.apple.SecurityServer"
                     "com.apple.SystemConfiguration.configd"
                     "com.apple.system.DirectoryService.libinfo_v1"
                     "com.apple.system.notification_center"))

    ; Rules to allow the operations mDNSResponder needs start here
    (allow network*) ; Allow networking, including Unix Domain Sockets
    (allow sysctl-read) ; To get hardware model information
    (allow file-read-metadata) ; Needed for dyld to work
    (allow ipc-posix-shm) ; Needed for POSIX shared memory

    (allow file-read-data (regex "^/dev/random\$"))
    (allow file-read-data file-write-data (regex "^/dev/console\$")) ; Needed for syslog early in the boot process
    (allow file-read-data (regex "^/dev/autofs_nowait\$")) ; Used by CF to circumvent automount triggers

    ; Allow us to read and write our socket
    (allow file-read* file-write* (regex "^/private/var/run/mDNSResponder\$"))

    ; Allow us to read system version, settings, and other miscellaneous necessary file system accesses
    (allow file-read-data (regex "^/usr/sbin(/mDNSResponder)?\$")) ; Needed for CFCopyVersionDictionary()
    (allow file-read-data (regex "^/usr/share/icu/.*\$"))
    (allow file-read-data (regex "^/usr/share/zoneinfo/.*\$"))
    (allow file-read-data (regex "^/System/Library/CoreServices/SystemVersion.*\$"))
    (allow file-read-data (regex "^/Library/Preferences/SystemConfiguration/preferences\.plist\$"))
    (allow file-read-data (regex "^/Library/Preferences/(ByHost/)?\.GlobalPreferences.*\.plist\$"))
    (allow file-read-data (regex "^/Library/Preferences/com\.apple\.security.*\.plist\$"))
    (allow file-read-data (regex "^/Library/Preferences/com\.apple\.crypto\.plist\$"))
    (allow file-read-data (regex "^/Library/Security/Trust Settings/Admin\.plist\$"))
    (allow file-read-data (regex "^/System/Library/Preferences/com\.apple\.security.*\.plist\$"))
    (allow file-read-data (regex "^/System/Library/Preferences/com\.apple\.crypto\.plist\$"))

    ; Allow access to System Keychain
    (allow file-read-data (regex "^/System/Library/Security\$"))
    (allow file-read-data (regex "^/System/Library/Keychains/.*\$"))
    (allow file-read-data (regex "^/Library/Keychains/System\.keychain\$"))

    ; Our Module Directory Services cache
    (allow file-read-data (regex "^/private/var/tmp/mds/"))
    (allow file-read* file-write* (regex "^/private/var/tmp/mds/[0-9]+(/|\$)"))

This code uses a deny-by-default policy. It does allow arbitrary network connections to and from the application.
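
To get a feel for how the file rules in this profile behave, they can be modeled outside the sandbox. The following Python sketch transcribes a handful of the allow rules and applies them with a deny-by-default check. It is only an illustration under stated assumptions: it is not how Seatbelt evaluates a profile, it assumes the \$ in the profile stands for an ordinary end-of-string anchor, it treats file-read* and file-write* as covering file-read-data and file-write-data, and ALLOW_RULES and is_allowed are hypothetical names.

```python
import re

# (operations, regex) pairs transcribed from a few rules in the profile.
# Assumption: the profile's "\$" is an end-of-string anchor, written here as "$".
ALLOW_RULES = [
    ({"file-read-data"}, r"^/dev/random$"),
    ({"file-read-data", "file-write-data"}, r"^/dev/console$"),
    ({"file-read-data", "file-write-data"}, r"^/private/var/run/mDNSResponder$"),
    ({"file-read-data"}, r"^/usr/sbin(/mDNSResponder)?$"),
    ({"file-read-data"}, r"^/usr/share/zoneinfo/.*$"),
    ({"file-read-data"}, r"^/Library/Keychains/System\.keychain$"),
]


def is_allowed(operation, path):
    """Return True if a transcribed allow rule matches; otherwise deny by default."""
    for operations, pattern in ALLOW_RULES:
        if operation in operations and re.search(pattern, path):
            return True
    return False  # mirrors (deny default)
```

With these rules, is_allowed("file-read-data", "/usr/sbin/mDNSResponder") is true, while a read of /usr/sbin/sshd or a write to /dev/random falls through to the default deny.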
The main restriction is that it carefully controls which files can be read and written. Therefore, even if you could run arbitrary code within the application, you couldn’t do many interesting things. A similar sandbox exists for ntpd. These sandboxes (if implemented correctly) effectively remove these applications from consideration by an attacker, or at the very least, make exploitation much more challenging.

There is one caveat to the sandboxes. The sandbox prevents the program in the sandbox and any of its children from doing anything interesting. It does not prevent them from passing data to applications that are not in a sandbox. This is one way it might be possible to escape from such a sandbox. Consider the following scenario. A system advertises, via the Bonjour protocol, that a new printer is available on the network. mDNSResponder notifies CUPS (not in a sandbox) to add the printer. If there is a vulnerability in the way CUPS adds printers, you have just gotten access to a nonsandboxed application through the mDNSResponder sandbox!

Taking all of this into consideration, if you’re looking for a server-side attack against a stock install of Mac OS X, your best bet is probably something like wireless drivers or a UDP-only attack against CUPS.

Before we conclude this discussion, please note that sometimes client programs open up ports which then become susceptible to remote attack, even if the user doesn’t connect to the attacker. iTunes is an example of this. When iTunes is launched, it listens on port 3689 (DAAP). This is the port iTunes uses for sharing music files. The interesting thing is that iTunes opens and listens on this port even if it is not configured for sharing music. The difference between music sharing being on and being off is that when it is off, iTunes doesn’t do much on that port. The following shows that with music sharing disabled, but iTunes running, it still listens on a port.

    $ lsof -P | grep iTunes | grep LISTEN
    iTunes 7662 cmiller 17u IPv4 0x5e0da68 0t0 TCP *:3689 (LISTEN)

However, the following is an exchange between a DAAP client and this port when music sharing is off.

    GET /server-info HTTP/1.1
    TE: deflate,gzip;q=0.3
    Keep-Alive: 300
    Connection: Keep-Alive, TE
    Host: localhost:3689
    User-Agent: libwww-perl/5.813

    HTTP/1.1 501 Not Implemented
    Date: Thu, 28 Aug 2008 01:39:15 GMT
    DAAP-Server: iTunes/7.7.1 (Mac OS X)
    Content-Type: application/x-dmap-tagged
    Content-Length: 0

In this case, iTunes returns a 501 error regardless of the input. However, it still offers the possibility for an attacker to have the Mac remotely process some data that relies only on the user having the iTunes process running.

Nonstandard Listening Processes

By accessing the Sharing pane in the System Preferences, users often turn on other services; see Figure 3-1.

Figure 3-1: The Sharing pane indicates which services are running.

The first option listed is DVD or CD Sharing. This option shares out the user’s DVD or CD drive to the subnet. This service is advertised using Bonjour and resides on some randomly chosen port.

    $ dns-sd -B _odisk._tcp
    Browsing for _odisk._tcp
    Timestamp A/R Flags if Domain Service Type Instance Name
    20:37:29.601 Add 3 9 local. _odisk._tcp. Charlie Miller’s Computer

In this case, a look at netstat reveals that a new port has opened on 53358. Following up with lsof, we can see what application has been spawned by activating this option in the Sharing pane.

    $ sudo lsof | grep 53358
    ODSAgent 40560 root 3u IPv6 0x3e78984 0t0 TCP *:53358 (LISTEN)

It is /System/Library/CoreServices/ODSAgent.app. This program basically uses an HTTP-based protocol, but it does some authentication; see Figure 3-2.
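
The lsof listings used throughout this chapter can also be filtered programmatically. The following Python sketch parses lines in the format shown above and keeps only TCP and UDP sockets that are not bound solely to localhost. It assumes the whitespace-separated layout seen in these listings, where the DEVICE column may or may not be present; Socket, parse_lsof_line, and exposed_sockets are hypothetical names, and the parser is not a complete treatment of lsof output.

```python
from typing import NamedTuple, Optional


class Socket(NamedTuple):
    command: str
    pid: int
    user: str
    protocol: str
    address: str
    state: Optional[str]


def parse_lsof_line(line):
    """Parse one lsof line; return a Socket, or None if it is not TCP/UDP."""
    fields = line.split()
    for index, token in enumerate(fields):
        # Locate the protocol column instead of counting columns, because
        # the DEVICE column is missing from some of the listings.
        if token in ("TCP", "UDP") and index >= 3 and index + 1 < len(fields):
            try:
                pid = int(fields[1])
            except ValueError:
                return None
            state = None
            if index + 2 < len(fields):
                state = fields[index + 2].strip("()")
            return Socket(fields[0], pid, fields[2], token, fields[index + 1], state)
    return None


def exposed_sockets(lsof_output):
    """Return the sockets that are not bound to localhost only."""
    sockets = []
    for line in lsof_output.splitlines():
        sock = parse_lsof_line(line)
        if sock is not None and not sock.address.startswith("localhost:"):
            sockets.append(sock)
    return sockets
```

Feeding the output of sudo lsof -P into exposed_sockets gives a list that can be sorted or compared before and after toggling an option in the Sharing pane.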

Figure 3-2: The data from a packet capture of a remote disk being authenticated

The client grabs what appears to be a .dmg or .iso image, whose name was provided by the server in the initial response. Within the data, you can see things like names of directories and files; see Figure 3-3.

The next item from the Sharing pane is Screen Sharing. This simply opens a VNC server on port 5900 and a Kerberos server on port 88. The Kerberos server is the standard krb5kdc application and is opened by the operating system the first time it is needed. The VNC server is AppleVNCS. If you notice this running on a Mac, you may want to look for bugs in it.

Next is the File Sharing option. This opens a server on port 548 (afpovertcp). Looking at lsof, you see that launchd is listening on that port. That doesn’t tell you much, though, because like inetd/xinetd, launchd hands off inbound connections to another application.

Figure 3-3: A disk image is retrieved.

To see what will be launched, look in the LaunchDaemons directory for configuration files containing the afpovertcp port.

    $ cd /System/Library/LaunchDaemons/
    $ grep -h -B 11 afpovertcp *
    <key>ProgramArguments</key>
    <array>
    <string>/usr/sbin/AppleFileServer</string>
    </array>
    <key>Sockets</key>
    <dict>
    <key>Listener</key>
    <dict>
    <key>Bonjour</key>
    <true/>
    <key>SockServiceName</key>
    <string>afpovertcp</string>

You see that AppleFileServer is the application that will be launched.

    $ /usr/sbin/AppleFileServer -v
    afpserver-530.8.3

AppleFileServer speaks Apple Filing Protocol (AFP), which functions much like the Network File System (NFS) protocol used by many UNIX systems, or the Server Message Block (SMB)/Common Internet File System (CIFS) used by Windows systems.

AppleFileServer has had bugs in the past (http://xforce.iss.net/xforce/xfdb/16049) and probably has more bugs. If you find it running on a target computer, take a closer look.

The next check box is Printer Sharing, which opens many ports.

    > launchd 1 root 56u IPv6 0t0 TCP *:515 (LISTEN)
    > launchd 1 root 61u IPv4 0t0 TCP *:515 (LISTEN)
    > launchd 1 root 93u IPv4 0t0 TCP *:139 (LISTEN)
    > launchd 1 root 94u IPv4 0t0 TCP *:445 (LISTEN)
    8a13,16
    > cupsd 45270 root 7u IPv6 0t0 TCP localhost:631 (LISTEN)
    > cupsd 45270 root 8u IPv4 0t0 TCP localhost:631 (LISTEN)
    > cupsd 45270 root 10u IPv6 0t0 TCP *:631 (LISTEN)
    > cupsd 45270 root 13u IPv4 0t0 TCP *:631 (LISTEN)

Launchd will launch /usr/libexec/cups/daemon/cups-lpd on port 515 (printer), and /usr/sbin/smbd (netbios-ssn 139, microsoft-ds 445). CUPS will now listen on the external interface. If the client is sharing a printer, the available attack surface becomes quite large.

The Web Sharing check box enables a standard Apache service on port 80. The webroot for this installation is at /Library/WebServer/Documents and the CGIs are in /Library/WebServer/CGI-Executables. By default, the CGI directory is empty, so no help there for an attacker.

The Remote Login option is a standard OpenSSH handled by launchd. The binary is at /usr/sbin/sshd. As of the writing of this book, the version string is OpenSSH_4.7p1, OpenSSL 0.9.7l 28 Sep 2006.

The final option we’ll discuss is Remote Apple Events. There are a few other options available in the Sharing pane, but they are relatively obscure or benign. Remote Apple Events enables the AEServer handled by launchd on port 3031 (eppc). This server allows remote users to run AppleScript programs on the computer running the AEServer. For example, on another computer, start the script editor (/Applications/AppleScript/Script Editor.app). Enter the following into the editor:

    set remoteMac to "eppc://user:password@MachineName.local"
    using terms from application "Finder"
        tell application "Finder" of machine remoteMac
            get name of every disk
        end tell
    end using terms from

When that code is executed, it will return the names of the disks from the computer that is allowing remote Apple events.
Note that this server does require authentication. That doesn’t mean there couldn’t be a pre-authentication bug, though!

Cutting into the Client Side

The attack surface when attacking Mac OS X clients is much larger than when restricting yourself to the server side. Any application that accesses the Internet is a potential target (as are many that don’t). Mac OS X is founded on the principle that things should be easy for the user; they should just work. For an attacker, this means the operating system is designed to handle a large number of formats and protocols automatically. For example, Safari will view just about any file format you can imagine. The key to determining the client-side attack surface is to understand exactly what types of files and protocols each application is willing to consume. And understanding that relies on understanding the relationship between the applications and the files they process.

Each application has an Info.plist file that declares the known URL protocols, extensions, MIME types, and file types the application can handle. In Mac OS X, LaunchServices is responsible for determining what application is associated with a given file type or extension. An application will get registered with LaunchServices whenever it is first put on disk and its Info.plist file is processed. Note that, typically, downloading an application from the Internet will present the user with a warning, which prevents an attacker from automatically registering application associations without the user’s knowledge.

The prototypical client-side application is Safari, the default web browser in Mac OS X. Look at its Info.plist file, which you can find at /Applications/Safari.app/Contents/Info.plist. What follows is the beginning of this file.

    <key>Application-Group</key>
    <string>dot-mac</string>
    <key>CFBundleDevelopmentRegion</key>
    <string>English</string>
    <key>CFBundleDocumentTypes</key>
    <array>
        <dict>
            <key>CFBundleTypeExtensions</key>
            <array>
                <string>css</string>
            </array>
            <key>CFBundleTypeIconFile</key>
            <string>document.icns</string>
            <key>CFBundleTypeMIMETypes</key>
            <array>
                <string>text/css</string>
            </array>
            <key>CFBundleTypeName</key>
            <string>CSS style sheet</string>
            <key>CFBundleTypeRole</key>
            <string>Viewer</string>
            <key>NSDocumentClass</key>
            <string>BrowserDocument</string>
        </dict>
        <dict>
            <key>CFBundleTypeExtensions</key>
            <array>
                <string>pdf</string>
            </array>
            <key>CFBundleTypeIconFile</key>
            <string>document.icns</string>
            <key>CFBundleTypeMIMETypes</key>
            <array>
                <string>application/pdf</string>
            </array>
            <key>CFBundleTypeName</key>
            <string>PDF document</string>
            <key>CFBundleTypeRole</key>
            <string>Viewer</string>
            <key>NSDocumentClass</key>
            <string>BrowserDocument</string>
        </dict>

The first important key is CFBundleDocumentTypes. This indicates the types of documents supported by the bundle. In this case it is an array of such types. The first is a CSS style sheet. This type of document has a file extension of .css and a MIME type of text/css. Based on the CFBundleTypeRole, Safari is registered as a viewer of this type. The next entry in the array is a PDF document, for which Safari is also a viewer. The following list reveals what each key means in the CFBundleDocumentTypes array.

    CFBundleTypeExtensions: The file name extension for the file
    CFBundleTypeIconFile: The icon in the bundle that Finder should associate with the file type
    CFBundleTypeMIMETypes: The MIME type for the file
    CFBundleTypeName: The text that will be shown in Finder to describe the file
    CFBundleTypeRole: Specifies whether the program can open (Viewer), open and save (Editor), or is simply a shell to another program
    LSIsAppleDefaultForType: Specifies whether the bundle should be the default application for this type

As we mentioned earlier, LaunchServices compiles all of this application information and stores it in a database. Querying this database, for example, determines what application is launched when a file is double-clicked in a Finder window. This database can be viewed by the lsregister program, as seen in the following output.

    $ /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -dump
    Checking data integrity......done.
    Status: Database is seeded.
    …
    bundle id: 55728
        path: /Applications/Safari.app
        name: Safari
        identifier: com.apple.Safari (0x80007605)
        canonical id: com.apple.safari (0x8000030f)
        version: 5525.20.1
        mod date: 7/7/2008 8:57:33
        reg date: 7/7/2008 9:03:34
        type code: 'APPL'
        creator code: 'sfri'
        sys version: 10.5
        flags: apple-internal relative-icon-path handles-file-url quarantined
        item flags: container package application extension-hidden native-app scriptable services ppc i386
        icon: Contents/Resources/compass.icns
        executable: Contents/MacOS/Safari
        inode: 565157
        exec inode: 8145048
        container id: 32
        library:
        library items:
        …
        --------------------------------------------------------
        claim id: 29988
            name: CSS style sheet
            rank: Default
            roles: Viewer
            flags: apple-internal relative-icon-path
            icon: Contents/Resources/document.icns
            bindings: .css, text/css
        --------------------------------------------------------
        claim id: 30016
            name: PDF document
            rank: Default
            roles: Viewer
            flags: apple-internal relative-icon-path
            icon: Contents/Resources/document.icns
            bindings: .pdf, application/pdf
        --------------------------------------------------------
        …

The information from Info.plist is seen in the database. A graphical tool called RCDefaultApp (http://www.rubicode.com/Software/RCDefaultApp/) queries the LaunchServices database and presents the information in a more coherent form; see Figure 3-4.

Figure 3-4: RCDefaultApp reveals that files with an atr extension are associated with QuickTime Player.

In this figure, RCDefaultApp indicates that any file with the extension “.atr” will be opened by the QuickTime Player. This particular file format is not used very often and therefore the code may not be well tested.
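
The document types an application claims can also be read straight from its Info.plist rather than from the LaunchServices database. The following Python sketch uses the standard plistlib module to list each entry of CFBundleDocumentTypes with its name, role, extensions, and MIME types. The function name document_claims and the shape of the dictionaries it returns are hypothetical, and the sketch only reflects what a bundle declares, not what LaunchServices has actually registered.

```python
import plistlib


def document_claims(info_plist_path):
    """Return one dict per CFBundleDocumentTypes entry in an Info.plist."""
    with open(info_plist_path, "rb") as handle:
        info = plistlib.load(handle)  # handles XML and binary plists
    claims = []
    for doc_type in info.get("CFBundleDocumentTypes", []):
        claims.append({
            "name": doc_type.get("CFBundleTypeName", ""),
            "role": doc_type.get("CFBundleTypeRole", ""),
            "extensions": list(doc_type.get("CFBundleTypeExtensions", [])),
            "mime_types": list(doc_type.get("CFBundleTypeMIMETypes", [])),
        })
    return claims
```

Running document_claims("/Applications/Safari.app/Contents/Info.plist") on a Mac would list the same kind of information the lsregister dump shows as claims, which makes it easy to look for rarely used extensions across many bundles.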
Such obscure file formats can be fertile grounds for fuzzing; see Chapter 5, “Finding Bugs.” RCDefaultApp can be used to find the application for each file format that the operating system recognizes.

Safari

Safari is the most feature-rich web browser in existence. Features, of course, require code, and additional code increases the attack surface. In this section you will see how to determine all the functionality accessible to an attacker when a Safari web browser visits the attacker’s website.

Safari handles a number of file formats and MIME types natively and has extensive support for file formats with built-in plug-ins. The LaunchServices database (derived from the Info.plist file and accessible via RCDefaultApp or from the Info.plist file directly) reveals the file types that are handled natively:

    $ cd /Applications/Safari.app/Contents
    $ grep -A3 CFBundleTypeExtensions Info.plist | grep string
    <string>css</string>
    <string>pdf</string>
    <string>webarchive</string>
    <string>syndarticle</string>
    <string>webbookmark</string>
    <string>webhistory</string>
    <string>webloc</string>
    <string>download</string>
    <string>gif</string>
    <string>html</string>
    <string>htm</string>
    <string>js</string>
    <string>jpg</string>
    <string>jpeg</string>
    <string>jp2</string>
    <string>txt</string>
    <string>text</string>
    <string>png</string>
    <string>tiff</string>
    <string>tif</string>
    <string>url</string>
    <string>ico</string>
    <string>xhtml</string>
    <string>xht</string>
    <string>xml</string>
    <string>xbl</string>
    <string>svg</string>

This list includes all file types handled remotely or locally, so they should be checked individually if you are looking for particular file types to attack remotely. For example, browsing to a “webarchive” file over the Internet will only download the file, not display it in Safari. Safari will natively render PDF, JPG, PNG, TIF, ICO, and SVG image formats. It also parses JavaScript, HTML, and XML.

Of course, with the help of plug-ins, there are many more file types supported. The easiest way to view these file types is to go to Help ➢ Installed Plug-ins in Safari; see Figure 3-5.

Figure 3-5 indicates that Safari handles .swf files with the Adobe Flash plug-in, which is installed by default. The QuickTime plug-in reveals an additional 59 file formats supported by Safari. It is hard to imagine a web browser that has no bugs when parsing more than 60 file formats.
The Java plug-in represents yet another vector of attack through Safari.

Figure 3-5: The list of installed Safari plug-ins and their associated file types

All of Safari’s Children

In addition to the formats Safari handles through native code and multimedia plug-ins, it can spawn a large number of other applications through URL handlers. Consult RCDefaultApp for a complete list; see Figure 3-6.

The number of possibilities is astounding. Want to launch the Dictionary.app program and look up the definition of attack surface? Just go to the URL dict://attack surface; see Figure 3-7. Although there isn’t a large variety of data that can be passed to this application, it was undoubtedly not designed to withstand malicious input.

Figure 3-6: RCDefaultApp reveals all the programs that are associated with various URLs, in this case webcal://

Figure 3-7: The Dictionary.app program launched from within Safari

Other interesting programs that can be launched include Address Book, iChat, iTunes, Help Viewer, iCal, Keynote, iPhoto, QuickTime Player, and, of course, Terminal and Finder. Sometimes the amount of data an attacker can input into these programs is very limited, but at the very least, simply by having a victim follow a link in Safari, it is possible to have the victim do the following:

    ■ Open a VNC session via the Screen Sharing application
    ■ Start an SMB or AFP session via Finder
    ■ Start a DAAP or ITPC session with iTunes
    ■ Begin an RTSP session with QuickTime Player

Besides being a way to launch other processes, the URL handlers themselves may have vulnerabilities. For example, iPhoto and iChat have been guilty of format-string vulnerabilities in the way they handle URLs.
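
A rough map of which applications declare which URL schemes can be built without RCDefaultApp. The following Python sketch scans the application bundles in a directory and collects the schemes listed under the CFBundleURLTypes and CFBundleURLSchemes keys of each Info.plist. The function names url_schemes and scheme_handlers are hypothetical, and reading bundles directly is only an approximation: LaunchServices decides which registered handler actually wins for a given scheme.

```python
import plistlib
from pathlib import Path


def url_schemes(info_plist_path):
    """Return the URL schemes declared in one Info.plist."""
    with open(info_plist_path, "rb") as handle:
        info = plistlib.load(handle)
    schemes = []
    for url_type in info.get("CFBundleURLTypes", []):
        schemes.extend(url_type.get("CFBundleURLSchemes", []))
    return schemes


def scheme_handlers(applications_dir):
    """Map each lowercased scheme to the bundles under applications_dir that declare it."""
    handlers = {}
    pattern = "*.app/Contents/Info.plist"
    for plist_path in sorted(Path(applications_dir).glob(pattern)):
        try:
            schemes = url_schemes(plist_path)
        except Exception:
            continue  # skip unreadable or malformed plists in this sketch
        app_name = plist_path.parent.parent.name  # e.g. "Safari.app"
        for scheme in schemes:
            handlers.setdefault(scheme.lower(), []).append(app_name)
    return handlers
```

On a Mac, scheme_handlers("/Applications") gives a starting list of programs that a link in Safari might be able to launch; applications stored elsewhere or nested in subfolders are not covered by this simple glob.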
This means simply by enticing a user to click on a link, the attacker may take advantage of a bug in the way Safari natively handles HTML, JavaScript, a handful of image formats, anything QuickTime Player plays, or any bugs in a variety of other software on the system, including Finder and iTunes. There is a very large attack surface for Safari!

Safe File Types

One of the great things about Safari, from a usability (or attack) perspective, is that it will open many file types automatically. Many security warnings issued against Apple will contain the phrase “Turn off automatic opening of safe files,” but what exactly is a safe file and which file types are considered safe?

The answer to this question can be found in the /System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/System file. This is an XML file that contains a list of file types (and MIME types and extensions) considered safe, neutral, or unsafe. The following is an excerpt from the beginning of this file.

    <key>LSRiskCategorySafe</key>
    <dict>
        <key>LSRiskCategoryContentTypes</key>
        <array>
            <string>com.adobe.encapsulated-postscript</string>
            <string>com.adobe.illustrator.ai-image</string>
            <string>com.adobe.pdf</string>
            <string>com.adobe.photoshop-image</string>
            <string>com.adobe.postscript</string>
            <string>com.apple.dashboard-widget</string>
            <string>com.apple.ical.ics</string>
            <string>com.apple.icns</string>
            <string>com.apple.installer-distribution-package</string>
            <string>com.apple.installer-package</string>
            <string>com.apple.keynote.key</string>
            <string>com.apple.pict</string>
            <string>com.apple.protected-mpeg-4-audio</string>
            <string>com.apple.quicktime-image</string>
            …

The possible categories include the following:

    LSRiskCategorySafe: Totally safe; Safari will auto-open after download
    LSRiskCategoryNeutral: No warning, but not auto-opened
    LSRiskCategoryUnsafeExecutable: Triggers a warning “This file is an application…”
    LSRiskCategoryMayContainUnsafeExecutable: This is for things like archives that contain an executable. It triggers a warning unless Safari can determine all the contents are safe or neutral

These settings can be overridden by the contents of the files /Library/Preferences/com.apple.DownloadAssessment.plist and ~/Library/Preferences/com.apple.DownloadAssessment.plist, which represent changes on a system-wide or user level, respectively. Using this information, it is possible to determine exactly which files Safari will automatically launch.

Having Your Cake

Safari’s ability to handle many file formats through plug-ins and being able to launch applications means that often it is possible for an attacker to choose which way they want their malicious content to be handled, either by Safari or by an accompanying application. For example, in Chapter 8, “Heap Overflows,” you’ll learn to write reliable exploits in Safari by using JavaScript. It might be convenient to exercise a vulnerability within Safari’s process space. If a bug is discovered that is exploitable only after hitting the Play button in QuickTime Player, it is still possible to exercise the bug in Safari. The following HTML code embeds in a web page any file that QuickTime Player can process, and plays it.

Accessing this HTML will automatically play the movie, in this case good.mov. Any corruption will occur in the same process space as Safari (including the JavaScript heap).

Conversely, if you would rather exploit a separate binary for this type of vulnerability, that is possible too. This might be necessary if Safari were in a sandbox (which it isn’t currently) or if you wanted to make some assumptions about memory layout, since Safari may have visited thousands of sites and be in an unknown state, but a newly launched application might be in a predictable state. The key to this is the way that Safari handles many file types automatically, including gzip files.
For many such files, if you access a gzip version of the file in Safari, it will automatically download the file, unzip it, and launch it in the default application for that type (according to LaunchServices). For example, if you’d rather exploit Preview than Safari with a GIF bug, simply gzip the image file and have the victim surf to the gzipped version. Safari will unzip it and render it with Preview.

Conclusion

A wise attacker will survey all the opportunities for attack and try the weakest spot. To do this, it is important to understand all the places where data enters the Mac OS X system. From the server side there aren’t many possibilities unless the user has enabled some additional software. From the client side, however, there are many ways to get data processed by a large number of client applications and libraries. At this point it is up to the attacker to pick a spot and start looking for problems. The remainder of this book will outline how to find a vulnerability in a particular bit of code and how to exploit it to gain control of the victim’s machine.

References

http://blog.washingtonpost.com/securityfix/2006/08/hijacking_a_macbook_in_60_seco.html
http://developer.apple.com/documentation/Carbon/Conceptual/LaunchServicesConcepts/LaunchServicesConcepts.pdf
http://www.macosxhints.com/article.php?story=20031215144430486
http://www.macosxhints.com/article.php?story=2004100508111340&query=LaunchServices
http://unsanity.org/archives/000449.php
http://support.apple.com/kb/HT2340?viewlocale=en_US
http://macenterprise.org/content/view/201/84/
http://projects.info-pull.com/moab/MOAB-04-01-2007.html
http://projects.info-pull.com/moab/MOAB-20-01-2007.html

Part II ■ Discovering Vulnerabilities

but does not allow for memory or registers to be read or written. Obviously a debugger without these functions would be useless.

One other Mac OS X ptrace feature worth discussing is the PT_DENY_ATTACH ptrace request.
This nonstandard request, available only on the Mac OS X version of ptrace, can be set by an application and denies future requests for processes to attach to it. This is a simple anti-debugging mechanism implemented mostly for applications such as iTunes. We’ll discuss this more, as well as ways of circumventing it, later in the chapter.

Good Ol’ GDB

Aside from the peculiarities discussed in the previous section, GDB pretty much works as you would hope and expect on Leopard. This is because GDB in Mac OS X is not implemented via ptrace, but rather mostly using the Mach API. From the user’s point of view, this doesn’t matter. GDB just works; it differs only behind the scenes.

That said, there are a few Mac OS X–specific GDB features worth mentioning. There are a handful of Mach-specific commands available under the GDB info command. These allow you to get information about processes besides the one to which you might be attached and provide detailed information about the attached process as well. Consider this example:

    (gdb) info mach-tasks
    65 processes:
        gdb-i386-apple-d is 1499 has task 0xe07
        mdworker is 1430 has task 0x408f
        Preview is 1284 has task 0x1003
        Pages is 1072 has task 0x418f

Then, information about the processes can be obtained with commands such as the following:

    (gdb) info mach-task 0x418f
    TASK_BASIC_INFO:
    suspend_count: 0
    virtual_size: 0x41647000
    resident_size: 0x35e6000
    TASK_THREAD_TIMES_INFO:
    (gdb) info mach-threads 0x418f
    Threads in task 0x418f:
        0x5403
        0x5503
        0x5603
        0x5703
        0x5803
        0x5903
        0x5a03
        0x5b03
        0x5c03
        0x5d03
        0x5e03
        0x5f03
        0x6003
        0x6103

The most useful of these commands are info mach-regions and info mach-region. The first of these two commands gets all the information for mapped memory.
    (gdb) info mach-regions
    Region from 0x0 to 0x1000 (---, max ---; copy, private, not-reserved)
    … from 0x1000 to 0xb2000 (r-x, max rwx; copy, private, not-reserved)
    … from 0xb2000 to 0xc8000 (rw-, max rwx; copy, private, not-reserved) (2 sub-regions)
    …

This is useful for finding writable and executable sections of code during exploitation. It can also be used to find large sections of mapped memory that you may have supplied as part of a heap spray (there’s more on this in Chapter 8, “Exploiting Heap Overflows”). The final command is used to find the current region in which a given address resides:

    (gdb) info mach-region 0xbfffee28
    Region from 0xbfffe000 to 0xc0000000 (rw-, max rwx; copy, private, not-reserved) (2 sub-regions)

DTrace

DTrace is a tracing framework available in Leopard that was originally developed at Sun for use in Solaris. It allows users access to applications at an extremely low level and provides a way for users to trace programs and even change their execution flow. What’s even better is that in most circumstances there is very little overhead in using DTrace, so the process still runs at full speed. DTrace is powerful because the underlying operating system and any applications that support it have been modified with special DTrace “probes.” These probes are placed throughout the kernel and are at locations such as the beginning and end of system calls. DTrace may request to perform a user-supplied action at any combination of these probes. The actions to be executed are written by the user using the D programming language, which will be discussed in the next section.

When you call the dtrace command, behind the scenes the D compiler is invoked. The compiled program is sent to the kernel, where DTrace activates the probes required and registers the actions to be performed. Since all of this is done dynamically, the probes that are not needed are not enabled and so there is little system slowdown.
In other words, the probes are always in the kernel, but they perform actions only when enabled.

D Programming Language

D is basically a small subset of C that lacks many control-flow constructs and has some additional DTrace-specific functions. Each D program consists of a number of clauses, each one describing which probe to enable and which action to take when that probe fires. The following is the obligatory “hello world” program in D.

    BEGIN
    {
        printf("Hello world");
    }

Copy this into a file called hello.d and execute it with the following:

    $ sudo dtrace -s hello.d
    dtrace: script 'hello.d' matched 1 probe
    CPU     ID                    FUNCTION:NAME
      0      1                           :BEGIN Hello world

You’ll have to type Ctrl+C to exit the program. This program uses a special probe called BEGIN, which fires at the start of each new tracing request. Many typical C-style operations and functions are available in D. See the following code.

    dtrace:::BEGIN
    {
        i = 0;
    }

    profile:::tick-1sec
    {
        i = i + 1;
        printf("Currently at %d", i);
    }

    profile:::tick-1sec
    /i==5/
    {
        exit(0);
    }

Here the tick-1sec probe fires every second. Notice the predicate /i==5/, which tells DTrace to run that clause’s action only when the variable i has the value 5. Using predicates in this manner is the only way to affect the program flow conditionally; there are no if-then statements in D. Executing this tracing request gives the following output.

    $ sudo dtrace -s counter.d
    dtrace: script 'counter.d' matched 3 probes
    CPU     ID                    FUNCTION:NAME
      0  18648                       :tick-1sec Currently at 1
      0  18648                       :tick-1sec Currently at 2
      0  18648                       :tick-1sec Currently at 3
      0  18648                       :tick-1sec Currently at 4
      0  18648                       :tick-1sec Currently at 5
      0  18648                       :tick-1sec

Describing Probes

Each probe has a human-readable name as well as a unique ID number. To see a list of all the available probes on a system, run the following command.
    $ sudo dtrace -l | more
       ID   PROVIDER         MODULE        FUNCTION NAME
        1     dtrace                                BEGIN
        2     dtrace                                END
        3     dtrace                                ERROR
        4   lockstat    mach_kernel    lck_mtx_lock adaptive-acquire
        5   lockstat    mach_kernel    lck_mtx_lock adaptive-spin
    …

A provider is a kernel module that is responsible for carrying out the instrumentation for particular probes. That is to say, each provider has a number of probes associated with it. The human-readable name consists of four parts: the provider, module, function, and name. The provider is the kernel module that instruments the system for its particular probes. The module name is the name of the kernel module for the probe or the name of the user library that contains the probe (for example, libSystem.B.dylib). The function is the one in which the probe is located. Finally, the name field supplies additional information on the probe’s use. The full name of a probe lists all four parts, separated by colons. For example, a valid name of a probe would be

    fbt:mach_kernel:ptrace:entry

One of the things that make DTrace powerful is that if you leave some fields of a probe name empty, DTrace applies the specified action to all probes that match the remaining fields. This is a wildcard mechanism that is very useful. It takes a small amount of time for each probe request to be enacted; however, this time penalty is approximately per request, not per probe! Therefore, enabling 100 probes through one clever use of a wildcard takes no more significant up-front time than enabling a single probe. The following code shows how this wildcard usage of DTrace can be utilized:

    syscall:::entry
    /pid == $1/
    {
    }

This small but powerful DTrace script enables every entry probe from the syscall provider; that is, a probe at the beginning of each system call. Notice the use of the built-in variable pid, which specifies the process identifier (PID) of the process that invoked the probe.
$1 is the first argument passed to the program. Here is an example of this probe’s use:

    $ sudo dtrace -s truss.d 1284
    dtrace: script 'truss.d' matched 427 probes
    CPU     ID                    FUNCTION:NAME
      1  18320                     kevent:entry
      1  18320                     kevent:entry
      1  18320                     kevent:entry
      0  17644                    geteuid:entry
      0  17644                    geteuid:entry
      0  17642                     getuid:entry
      0  17644                    geteuid:entry
      0  18270                     stat64:entry
      0  18270                     stat64:entry

Notice that due to the wildcard, a single probe description in this D program activated 427 probes.

Example: Using DTrace

Now that you have a basic understanding of DTrace, let’s examine how to leverage it to provide information that will help in finding and exploiting bugs in Leopard. Suppose you want to monitor which files an application is accessing. This could be useful for tracing information, for seeing whether there is a directory-traversal attack during testing, or for identifying important configuration files used by closed-source applications. To accomplish these tasks, in Windows there exists the Filemon utility. In Mac OS X there is fs_usage. Here we replicate the functionality in DTrace with filemon.d.

    syscall::open:entry
    /pid == $1/
    {
        printf("%s(%s)", probefunc, copyinstr(arg0));
    }

    syscall::open:return
    /pid == $1/
    {
        printf("\t\t = %d\n", arg1);
    }

    syscall::close:entry
    /pid == $1/
    {
        printf("%s(%d)\n", probefunc, arg0);
    }

Running this simple tracing program reveals the files accessed by Preview.
    $ sudo dtrace -qs filemon.d 2060
    open(/Users/cmiller/Library/Mail Downloads/MyTravelPlans.pdf)   = 8
    close(8)
    open(/.vol/234881026/1179352)   = 8
    close(8)
    open(/Applications/Preview.app/Contents/Resources/English.lproj/PDFDocument.nib/keyedobjects.nib)   = 8
    close(8)
    open(/System/Library/Displays/Overrides/DisplayVendorID-610/DisplayProductID-9c5f)   = 8
    close(8)
    open(/dev/autofs_nowait)   = 8
    open(/System/Library/Displays/Overrides/Contents/Resources/da.lproj/Localizable.strings)   = 9
    close(9)
    close(8)

Example: Using ltrace

DTrace provides a simple way to follow which library calls are executed, like the useful ltrace utility in Linux. Here is a very simple DTrace program that will do something similar. Obviously a more complete version could be written.

    pid$target:::entry
    {
        ;
    }

    pid$target:::return
    {
        printf("=%d\n", arg1);
    }

This script simply records when any function is called, and the return value of that function. By changing the script slightly, you could limit it to the functions within the main binary or just function calls from one library to another, for instance from WebKit to libSystem. That is the power of DTrace; it is completely configurable by the user. Here is this script in action against Safari.
    $ sudo dtrace -F -p 65527 -s ltrace.d
    1 -> WTF::HashTable, WTF::IntHash, WTF::HashTraits, WTF::HashTraits >::remove(i
    1 <- WTF::HashTable, WTF::IntHash, WTF::HashTraits, WTF::HashTraits >::remove(i =6
    1 -> WebCore::TimerBase::heapDecreaseKey()
    1 -> void std::__push_heap(WebCore::TimerHeapIterator, int, int, WebCore
    1 <- void std::__push_heap(WebCore::TimerHeapIterator, int, int, WebCore =365032192
    1 <- WebCore::TimerBase::heapDecreaseKey() =365032192
    1 -> WebCore::updateSharedTimer()
    1 <- WebCore::updateSharedTimer() =0
    1 -> WebCore::stopSharedTimer()
    1 -> CFRunLoopTimerInvalidate
    1 -> CFRetain
    1 <- CFRetain =0
    1 -> _CFRetain
    1 -> OSAtomicCompareAndSwapIntBarrier
    1 <- OSAtomicCompareAndSwapIntBarrier =1
    1 <- _CFRetain =367732064
    1 -> spin_lock
    1 -> spin_lock
    1 -> CFDictionaryRemoveValue
    1 -> __CFDictionaryFindBuckets1a
    1 <- __CFDictionaryFindBuckets1a =238
    1 <- CFDictionaryRemoveValue =1582186028

It takes about 30 seconds for all the probes to be enabled. More detailed information could be included, as well, but this example is intended to show you how only a few lines of D can dig into what an application is doing.

Example: Instruction Tracer/Code-Coverage Monitor

It is useful to know the code that an application is executing. Using DTrace, you can get either an instruction trace or an overall code-coverage report. Although you cannot hope to apply millions of probes (for example, at each basic block), you can perform less ambitious tasks, such as monitoring which functions or instructions within a function are being executed. The following is a probe that traces all the instructions executed within the jsRegExpCompile function within the JavaScriptCore library. This function has been responsible for a couple of high-profile vulnerabilities in Safari.

    pid$target:JavaScriptCore:jsRegExpCompile*:
    {
        printf("08%x\n", uregs[R_EIP]);
    }

Note that the format string, as written, emits a literal 08 in front of each hexadecimal address. Running this script with DTrace produces a list of the instructions executed in this function.
    $ sudo dtrace -qp 65567 -s instruction_tracer.d
    089478a4e0
    089478a4e0
    089478a4e1
    089478a4e3
    089478a4e4
    …

Likewise, the following probe will trace every function entered in the JavaScriptCore library.

    pid$target:JavaScriptCore::entry
    {
        printf("08%x:%s\n", uregs[R_EIP], probefunc);
    }

Here is a sample of running it.

    $ sudo dtrace -qp 65567 -s instruction_tracer2.d
    0894784cf0:WTF::fastMalloc(unsigned long)
    0894787160:WTF::fastFree(void*)
    0894787850:WTF::fastZeroedMalloc(unsigned long)
    0894784cf0:WTF::fastMalloc(unsigned long)
    0894787160:WTF::fastFree(void*)
    089478f8e0:KJS::JSLock::lock()
    089478f9a0:KJS::JSLock::registerThread()
    089478f9b0:KJS::Collector::registerThread()
    0894796910:KJS::JSObject::type() const
    08947b3080:KJS::InternalFunctionImp::implementsCall() const
    08947993f0:KJS::JSGlobalObject::globalExec()
    0894799400:KJS::JSGlobalObject::startTimeoutCheck()
    08947fd3f0:KJS::JSObject::call(KJS::ExecState*, KJS::JSObject*, KJS::List const&)
    08947b90b0:KJS::FunctionImp::callAsFunction(KJS::ExecState*, KJS::JSObject*, KJS::List const&)
    08947b92c0:KJS::FunctionExecState::FunctionExecState(KJS::JSGlobalObject*, KJS::JSObject*, KJS::FunctionBodyNode*, KJS::ExecState*, KJS::F
    08947b9430:KJS::JSGlobalObject::pushActivation(KJS::ExecState*)
    08947b9530:KJS::ActivationImp::init(KJS::ExecState*)

If you aren’t interested in the order of execution but purely in which functions or instructions are executed, you can use the following probes. For instructions within a function, we use the following:

    pid$target:JavaScriptCore:jsRegExpCompile*:
    {
        @code_coverage[uregs[R_EIP]] = count();
    }

    END
    {
        printa("0x%x : %@d\n", @code_coverage);
    }

Here we trace only the instructions within the jsRegExpCompile function in the JavaScriptCore framework. Of course, we could do this for any combination of functions or, for that matter, all instructions. The @ sign denotes a special kind of variable in D called an aggregation. This is an efficient way for DTrace to collect data.
The printa function is used to print aggregations, and the @ sign in its format string prints the corresponding aggregate value, in this case the number of times the probe fired. Running this script against Safari reveals the following:

    $ sudo dtrace -p 4535 -qs code_coverage.d
    ^C
    0x9714f4e1 : 6
    0x9714f4e3 : 6
    0x9714f4e4 : 6
    0x9714f4e5 : 6
    0x9714f4e6 : 6
    0x9714f4e9 : 6
    0x9714f4ec : 6
    0x9714f4f1 : 6
    0x9714f4f2 : 6
    0x9714f4f5 : 6
    0x9714f4f8 : 6
    0x9714f4ff : 6
    0x9714f501 : 6
    0x9714f507 : 6
    0x9714f50a : 6
    …

It doesn’t print anything until you quit DTrace, at which point it prints out all the instructions that were hit and the number of times each was executed. Here is the function-coverage program.

    pid$target:JavaScriptCore::entry
    {
        @code_coverage[probefunc] = count();
    }

With just a few lines of D you are able to replicate much of the functionality of Pai Mei, which is a reverse-engineering framework named after a character in the movie Kill Bill 2. We’ll discuss Pai Mei in more detail in the section “Binary Code Coverage with Pai Mei” later in this chapter. Here is an example of this probe in use.

    $ sudo dtrace -p 65567 -s code_coverage2.d
    dtrace: script 'code_coverage2.d' matched 2048 probes
    ^C
      KJS::CaseBlockNode::executeBlock(KJS::ExecState*, KJS::JSValue*)    1
      KJS::Collector::collect()    1
      KJS::Collector::markCurrentThreadConservatively()    1
      KJS::Collector::markProtectedObjects()    1
      KJS::Collector::markStackObjectsConservatively(void*, void*)    1
      KJS::DoWhileNode::execute(KJS::ExecState*)    1
      KJS::EmptyStatementNode::EmptyStatementNode()    1
      KJS::EmptyStatementNode::isEmptyStatement() const    1

Example: Memory Tracer

The final example is useful for heap analysis. This program will allow you to watch as buffers are allocated and freed. In particular, you can watch allocations of particular sizes, which might help you track down what is happening to the data you are passing into the target program.
Additionally, stack backtraces could be printed for allocations that match the buffer size using the D function ustack().

    pid$target::malloc:entry,
    pid$target::valloc:entry
    {
        allocation = arg0;
    }

    pid$target::realloc:entry
    {
        allocation = arg1;
    }

    pid$target::calloc:entry
    {
        allocation = arg0 * arg1;
    }

    pid$target::calloc:return,
    pid$target::malloc:return,
    pid$target::valloc:return,
    pid$target::realloc:return
    /allocation > 300 && allocation < 9000/
    {
        printf("m: 0x%x (0x%x)\n", arg1, allocation);
        mallocs[arg1] = allocation;
    }

This prints only allocations of sizes between 300 and 9,000 bytes. Running this against Safari provides the following output.

    m: 0x8bbe00 (0x250)
    f: 0x8bbe00 (0x250)
    m: 0x8bbe00 (0x250)
    f: 0x8bbe00 (0x250)
    m: 0x8bbe00 (0x250)
    f: 0x8bbe00 (0x250)
    m: 0x8bbe00 (0x250)
    f: 0x8bbe00 (0x250)
    m: 0x8bbe00 (0x250)
    m: 0x1726d810 (0x140)
    f: 0x1726d810 (0x140)
    m: 0x981200 (0x250)
    …

The m: lines record allocations. The f: lines record frees of tracked buffers; the clause that prints them does not appear in the listing above.

PyDbg

DTrace is a great way to look inside a process and see what is going on; however, it does have some limitations. In particular, the D programming language has deficiencies with regard to conditional statements. Furthermore, DTrace is designed only to trace, and sometimes you may want to do something a little more complicated. For example, DTrace can’t do much with the virtual-memory layout of a process. Sometimes you want the options that only a full debugging session can provide. We already talked about GDB, which can be useful for simple things, but another tool exists: PyDbg.

PyDbg was written as a pure Python Win32 debugger. Since it was written in Python, it could be accessed programmatically and also had access to all the existing Python libraries. In 2007 one of the authors of this book tried to port this library to Mac OS X, but the port was very buggy and incomplete. A more complete version for Leopard is now available from the book’s website, www.wiley.com/go/machackershandbook.
PyDbg can be used to do anything you might want to do with GDB, except it can also utilize all the power of Python.

PyDbg Basics

We’ll step through a very basic PyDbg script to show you how it works. The following Python script sets a breakpoint at the address passed as the second argument and dumps out the context whenever it is hit.

    #!python
    import sys              # needed for sys.argv below
    from pydbg import *

    def handler_breakpoint(pydbg):
        print '----------------Dumping context'
        print pydbg.dump_context()
        return DBG_CONTINUE

    dbg = pydbg()
    # register a breakpoint handler function.
    dbg.set_callback(EXCEPTION_BREAKPOINT, handler_breakpoint)
    dbg.attach(int(sys.argv[1]))
    dbg.bp_set(int(sys.argv[2], 16), "", 1)
    dbg.debug_event_loop()

The script first imports the PyDbg framework (along with the standard sys module, which it uses to read its arguments). The next bit of code defines a function called handler_breakpoint that takes a pydbg instance as an argument. This function prints out the execution context of the process and then tells PyDbg the breakpoint exception has been handled. Next, the actual script begins. A pydbg instance is created. Next, the handler_breakpoint function is set to handle breakpoint exceptions. The script then attaches to the process whose PID was passed as the first argument and sets a breakpoint at the address passed as the second argument. The first argument to the bp_set function is the address at which to place the breakpoint. The second is an optional description for the breakpoint. The final argument tells PyDbg whether to restore the breakpoint once it has been hit; that is, whether the breakpoint should be kept or removed. Finally, the main PyDbg event-processing loop is entered. Running this example gives output similar to the following.
    $ python test.py 1324 0x00001fc3
    ----------------Dumping context
    ALLOCATE RETURNED WITH 9000
    CONTEXT DUMP
      EIP: 00001fc3 mov eax,[ebp-0xc]
      EAX: 00000000 (         0) -> N/A
      EBX: 00001fa6 (      8102) -> N/A
      ECX: bffff6ac (3221223084) -> /z (stack)
      EDX: 96735b06 (2524142342) -> N/A
      EDI: 00000000 (         0) -> N/A
      ESI: 00000000 (         0) -> N/A
      EBP: bffff778 (3221223288) -> ....n.......................................................O...{.......................2...T.......*...;...C...W...g................................/test.../test.MANPATH=/sw/share/man:/Library/Frameworks/Python.framework/Versions/Current/man:/opt/local/sh (stack)
      ESP: bffff750 (3221223248) -> ....B...K...................C...............n.......................................................O...{.......................2...T.......*...;...C...W...g................................/test.../test.MANPATH=/sw/share/man:/Library/Frameworks/Python.fram (stack)
      +00: 00000001 (         1) -> N/A
      +04: 00000042 (        66) -> N/A
      +08: 8fe0154b (2413827403) -> N/A
      +0c: 00001000 (      4096) -> N/A
      +10: 00000000 (         0) -> N/A
      +14: 00000000 (         0) -> N/A
    …

Now that you understand the basics of PyDbg, we’ll walk you through a few examples of its use to give a flavor for the types of things it can do. The possibilities are limited only by the user’s imagination.

Memory Searching

One of the features that GDB is missing on all platforms is the ability to search memory. There are many times when this capability would be useful, such as when searching memory to see where a file has been mapped, or looking for shellcode. Using PyDbg, this is rather simple. Consider the following PyDbg script:

    #!python
    import sys              # needed for sys.argv below
    from pydbg import *

    dbg = pydbg()
    dbg.attach(int(sys.argv[1]))
    dbg.search_memory("PATH")
    dbg.detach()

This script simply performs the necessary prologue, attaches to a process specified by the PID, searches memory for the string “PATH,” and then detaches from the process.
This is all accomplished in basically four lines of Python.

    $ python test9.py 625
    8fe25ca0: 4c 44 5f 46 52 41 4d 45 57 4f 52 4b 5f 50 41 54 LD_FRAMEWORK_PAT
    8fe25cb0: 48 00 44 59 4c 44 5f 46 41 4c 4c 42 41 43 4b 5f H.DYLD_FALLBACK_
    bffff830: 73 74 00 00 2e 2f 74 65 73 74 00 4d 41 4e 50 41 st.../test.MANPA
    bffff840: 54 48 3d 2f 73 77 2f 73 68 61 72 65 2f 6d 61 6e TH=/sw/share/man

In this example, the script found two instances of the string “PATH” in memory.

In-Memory Fuzzing

In the next chapter, we will discuss the vulnerability-discovery technique known as fuzzing. This technique has been used to find a variety of security issues in a wide range of programs. The basic idea is to send anomalous data into a program in an attempt to make it crash. One problem that comes up in fuzzing can be addressed with PyDbg. Namely, with fuzzing, we are limited to interacting only with the interfaces of the target, but sometimes we are interested in a particular section of code located deep within the program.

This issue may manifest itself in a number of ways. The data entering the program may be encrypted. Rather than reimplement the program’s encryption algorithm so that the inputs are passed as the target expects, it would be easier to fuzz the part of the program that deals with the unencrypted payload. The same argument holds true for complex, multistep protocols. If we really want to fuzz only one packet type, but to get to that portion of the protocol we first need to send a number of complex packets, we will be doing much more work than we’d like. An example of this occurs with SSL, where a number of packets need to be exchanged before certain SSL packets are expected and processed. The same would be true in a shopping application.
If we wanted to fuzz the code responsible for parsing a credit-card number, we’d have to design our fuzzer such that it authenticated to the application, selected some items for the shopping cart, checked out, and entered the shipping information, all before sending a single fuzzed credit-card number. Then it would have to clean up by removing items from the cart, logging out, etc. This is a lot of overhead when we’re interested in fuzzing only a few lines of code.

The solution is to fuzz not the interface, but the actual code we are interested in. Consider the following simple application:

    #include <stdio.h>   /* printf, getchar */
    #include <stdlib.h>  /* atoi */
    #include <string.h>  /* memcpy */

    void print_hi(int y){
        char x[4];
        memcpy(x, "hi", 2);
        x[y] = 0;
        printf("%s\n", x);
    }

    int main(int argc, char *argv[]){
        getchar();
        print_hi(atoi(argv[1]));
    }

This program attempts to print out the word “hi” but allows the user to specify, in the first argument to the program, where the terminating null byte should go. The call to getchar() is there to allow you time to attach to the program, but isn’t necessary. This program could easily be fuzzed in the traditional method, at the interface (in this case via command-line arguments), but here it serves as an example of how to fuzz from within a program. You can do this by writing a PyDbg script. The basic idea is to take a snapshot of the memory and context at the beginning of the function print_hi, then execute that function many times with different inputs, being careful to restore the snapshot before each execution. In this way you get to try many values of inputs to the function print_hi but you have to send only one input to the program. PyDbg handles the rest.
    #!python
    import struct           # needed for struct.pack below
    import sys              # needed for sys.argv below
    from pydbg import *

    value = 0

    def handler_badness(pydbg):
        global value
        print "Caused a fault with input %x" % value
        return DBG_EXCEPTION_HANDLED

    def handler_breakpoint(pydbg):
        global value
        if pydbg.context.Eip == 0x00001fbc:
            # entry to print_hi: take the snapshot
            pydbg.suspend_all_threads()
            pydbg.process_snapshot()
            pydbg.resume_all_threads()
        elif pydbg.context.Eip == 0x00001ffc:
            # print_hi has returned: restore and try the next value
            pydbg.suspend_all_threads()
            pydbg.process_restore()
            pydbg.write_process_memory(pydbg.context.Esp, struct.pack('L', value))
            pydbg.resume_all_threads()
            value = value + 1
        else:
            # instruction after entry: set the post-return breakpoint
            pydbg.bp_set(0x00001ffc, "", 0)
        return DBG_CONTINUE

    dbg = pydbg()
    # register a breakpoint handler function.
    dbg.set_callback(EXCEPTION_BREAKPOINT, handler_breakpoint)
    dbg.set_callback(EXCEPTION_ACCESS_VIOLATION, handler_badness)
    dbg.attach(int(sys.argv[1]))
    dbg.bp_set(0x00001fbc, "Entry to function print_hi", 0)
    dbg.bp_set(0x00001fbf, "The next instruction after entry", 1)
    dbg.debug_event_loop()

Take a closer look at this script. Again the script begins by importing PyDbg, along with the standard struct and sys modules it uses. Next it defines an exception handler, which simply prints out the value of the global variable value. The next function contains the meat of the script. The function can take three actions, depending on the value of the program counter at the moment the function is called.

The first action is for when the function print_hi is entered (address 0x00001fbc). In that case the handler function takes a memory snapshot of the process. This entails saving a copy of all the writable memory regions as well as the current values of the context (registers) for each of the threads.

The second action occurs after the execution of the instruction that follows the taking of the snapshot (the breakpoint at 0x00001fbf, handled by the else branch). Keep in mind that this will be the first instruction executed after the snapshot is restored. This action sets a breakpoint at the first instruction that is executed after the print_hi function returns; that is, when the function being fuzzed is complete.
The third action occurs at this breakpoint (address 0x00001ffc in the script), after the print_hi function completes. At this point the function has executed completely and no problems have been found, or else the program would not have gone this far. The script now restores the snapshot and writes a new value for the argument to this function, stored on the stack. It then continues execution (from where the snapshot occurred). Restoring the snapshot includes copying the stored memory regions to where they were read from and returning the context to its previous state.

Finally, the script registers these functions for the appropriate exceptions, attaches to the process in question, and sets breakpoints at the first and second instructions in the function. It then enters the event loop.

Notice that you can’t set the final breakpoint for after print_hi completes before the first snapshot is taken. Otherwise you run into the strange situation where the breakpoint is included in the snapshot (a 0xCC is in memory, but PyDbg may no longer realize it is there). Setting the breakpoint dynamically, as this script does, removes any possibility of the debugger getting confused by breakpoints stored within the snapshot.

Here is what running the program and attaching with the PyDbg script looks like:

    $ ./test5 2
    hi
    h
    hi
    hi?
    hi??
    hi???
    hi????
    hi????u
    hi????u?
    hi????u?
    hi????u??
    hi????u???
    hi????u????
    hi????u?????
    hi????u??????
    hi????u???????
    hi????u????????
    Bus error

In the window running the fuzzer, you simply see the following output:

    Caused a fault with input 11

The handler prints the value in hexadecimal, so the faulting input here is 0x11 (17 decimal). In this case you fuzzed with the simplest type, an integer, but you could have done things more intelligently, such as by trying all the powers of 2, or large and small values, or other possibilities.
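The more intelligent choice of integers mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name fuzz_integers and the 32-bit width are hypothetical and are not part of PyDbg or of the fuzzing script.

```python
def fuzz_integers(bits=32):
    """Yield boundary-style integers: small values, powers of 2 and their
    neighbours, and the largest values that fit in `bits` bits.
    (Hypothetical helper for illustration; not part of PyDbg.)"""
    limit = 1 << bits
    candidates = list(range(0, 17))              # small values first
    for shift in range(bits):
        power = 1 << shift
        candidates.extend([power - 1, power, power + 1])
    candidates.extend([limit - 2, limit - 1])    # largest unsigned values
    seen = set()
    for value in candidates:
        value %= limit                           # keep within the chosen width
        if value not in seen:                    # skip duplicates
            seen.add(value)
            yield value


if __name__ == "__main__":
    values = list(fuzz_integers())
    print("%d candidates, largest 0x%x" % (len(values), max(values)))
```

In the breakpoint handler, each restore could then take the next value from such a generator instead of executing value = value + 1.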
For other types, such as strings (char *), each time you want to run the function being tested, you can allocate some space in the process being tested, write the string to this new space, and replace the pointer being passed to the function with a pointer to your new string.

Binary Code Coverage with Pai Mei

Another situation in which DTrace fails is when you want to perform actions at hundreds (or thousands) of different places. It simply takes too long to activate that number of probes. An example of this is when you want to perform actions at each basic block, such as when collecting code coverage in binaries. For this, you would like to set a breakpoint at each basic block in a program. Then, by observing which breakpoints were hit, you would know which basic blocks were executed, and thus you would have your code-coverage information without requiring source code.

Code coverage can be useful during testing because it helps indicate the sections of code that have not been tested. Code-coverage information has other uses, as well. For example, when reverse-engineering a binary, you can isolate the function for which various pieces of the executable are responsible. In this manner, you are able to break up large binaries into smaller pieces that are more manageable. This can be helpful when trying to figure out why a particular binary crashes on a given input. We’ll spend more time on reverse engineering in this manner in Chapter 6, “Reverse Engineering.”

Pai Mei is a reverse-engineering framework built on top of PyDbg (Figure 4-1). Since PyDbg now works on Mac OS X, we get Pai Mei for free. One of the most useful Pai Mei modules is called pstalker, or Process Stalker. This module does exactly what we have been discussing; it can set breakpoints at each function or basic block and record which are hit when tested. We’ll walk through a complete example of how to use this tool in Mac OS X.
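The bookkeeping behind breakpoint-based code coverage is simple enough to sketch in plain Python. The following is not Pai Mei's code: the class CoverageRecorder, the function record, and the addresses are hypothetical, and the "trace" is a hand-written list standing in for real breakpoint events. The sketch models the idea only: one one-shot breakpoint per basic block, a record of which ones were hit, and a set difference to isolate the blocks that belong to one action.

```python
class CoverageRecorder(object):
    """Toy model of breakpoint-based coverage (hypothetical, not Pai Mei)."""

    def __init__(self, basic_blocks):
        # One pending breakpoint per basic-block start address.
        self.pending = set(basic_blocks)
        self.hit = set()

    def on_breakpoint(self, address):
        # A real debugger would call this from its breakpoint handler.
        if address in self.pending:
            self.pending.remove(address)   # one-shot: not restored after a hit
            self.hit.add(address)


def record(basic_blocks, trace):
    """Replay a list of executed addresses and return the set of blocks hit."""
    recorder = CoverageRecorder(basic_blocks)
    for address in trace:
        recorder.on_breakpoint(address)
    return recorder.hit


if __name__ == "__main__":
    blocks = [0x1000, 0x1010, 0x1024, 0x1030, 0x1048]
    # Run 1: ordinary use of the program.
    baseline = record(blocks, [0x1000, 0x1010, 0x1000])
    # Run 2: the same use plus the one action of interest.
    with_action = record(blocks, [0x1000, 0x1010, 0x1030, 0x1048])
    only_action = sorted(with_action - baseline)
    print([hex(a) for a in only_action])
```

Subtracting a baseline run from a run that includes the action of interest leaves only the blocks specific to that action, which is the filtering idea that makes coverage useful for reverse engineering.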
Figure 4-1: An overview of the Pai Mei architecture

As an example of how you might use Pai Mei to isolate the portion of an executable that performs a particular action, consider the Calculator program that comes installed in Mac OS X. Suppose you wanted to know exactly which basic blocks in the binary were responsible for the + button (that is to say, only the basic blocks that are executed when the + button is pushed). One way to find this information would be to spend many hours (or days) statically reverse-engineering the binary and associated libraries in an attempt to understand exactly how the program works. Another approach is to use Pai Mei to get the answer in a few minutes.

The first thing you need to do to use Pai Mei is to tell it where all the basic blocks from the binary begin, that is, where it should set the breakpoints. The way to do this is through IDA Pro (http://www.hex-rays.com/idapro/), a commercial disassembler. For over a year, IDA Pro has had excellent support for Mach-O universal binaries. Unfortunately, IDA Pro runs only in Windows, so you’ll need a computer with Windows or a virtual machine running Windows for this step. Pai Mei works on individual libraries or binaries, so you’ll have to decide which one to start with (you can include multiple ones, if you wish). The following code uses otool to get a list of the shared libraries Calculator uses.
    $ otool -L /Applications/Calculator.app/Contents/MacOS/Calculator
    /Applications/Calculator.app/Contents/MacOS/Calculator:
        /System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa (compatibility version 1.0.0, current version 12.0.0)
        /System/Library/PrivateFrameworks/SpeechDictionary.framework/Versions/A/SpeechDictionary (compatibility version 1.0.0, current version 1.0.0)
        /System/Library/PrivateFrameworks/SpeechObjects.framework/Versions/A/SpeechObjects (compatibility version 1.0.0, current version 1.0.0)
        /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration (compatibility version 1.0.0, current version 204.0.0)
        /System/Library/PrivateFrameworks/Calculate.framework/Versions/A/Calculate (compatibility version 1.0.0, current version 1.0.0)
        /System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices (compatibility version 1.0.0, current version 34.0.0)
        /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.0.0)
        /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 227.0.0)
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 476.0.0)
        /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit (compatibility version 45.0.0, current version 949.0.0)
        /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 677.0.0)

Of these, the framework called Calculate seems most promising, so select that one. Grabbing that file, transferring it to a Windows computer with IDA Pro, and dragging it onto the IDA Pro icon starts the disassembly. Immediately, IDA Pro recognizes it is a universal binary and asks which architecture you want to examine; see Figure 4-2. Select Fat Mach-O File, 3. I386. After a few seconds, IDA Pro will complete its disassembly.
At this point you can take advantage of an IDA Pro add-on called IDAPython (http://d-dome.net/idapython/) that allows Python scripts to be run within IDA Pro. Pai Mei comes with one called pida_dump.py. Select File ➢ Python File ➢ pida_dump.py. It will ask what level of analysis you require. For this project, choose basic blocks. Answer no to the next two dialogs, which concern API calls and RPC interfaces. Finally, save the resulting file as Calculate.pida. PIDA files are binary files that contain the information Pai Mei needs for a given binary. Within Python, these contents can be accessed with the pida module:

    #!python
    import pida

    # Load the PIDA file saved from IDA Pro in the previous step.
    p = pida.load("Calculate.pida")

    for f in p.nodes.values():
        print "Function %s starts at %x and ends at %x" % (f.name, f.ea_start, f.ea_end)
        # List every basic block that belongs to this function.
        for bb in f.nodes.values():
            print "    Basic block %x" % bb.ea_start

Figure 4-2: IDA Pro dissects the library.

Executing this script lists every function in the Calculate shared library, along with the address of each of its basic blocks.

    Function _memcpy starts at c203 and ends at c207
        Basic block c203
    Function _calc_yylex starts at 6605 and ends at 73ad
        Basic block 7200
        Basic block 7003
    …

Now that you have the necessary PIDA file, it is time to fire up Pai Mei and get to work. Start it from the command line.

    $ python PAIMEIconsole.pyw

Click on the PAIMEIpstalker icon. Pai Mei stores all of its information in a MySQL database. Connect to it by selecting Connections ➢ MySQL Connect. Next, load the PIDA file you created earlier by pressing the Add Module(s) button.

Now you need to create a target and a couple of tags. The basic idea to discover what code is exclusively related to the + button is first to find code that is not associated with the + button. Then record the code executed when you press the + button, and remove any of the hits that were executed when you didn’t press the + button.
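The subtraction described here can be sketched in a few lines of Python. This is only an illustration of the idea: the function names and the basic-block addresses below are made up, and the real bookkeeping is done through tags and the MySQL database rather than in-memory sets.

```python
def breakpoints_to_arm(all_blocks, filtered_hits):
    """Blocks that still get a breakpoint once already-seen hits are filtered out."""
    return set(all_blocks) - set(filtered_hits)


def exclusive_hits(all_blocks, baseline_hits, executed_in_second_run):
    """Blocks executed in the second run that never ran during the baseline run."""
    armed = breakpoints_to_arm(all_blocks, baseline_hits)
    # Only armed blocks can be recorded; anything seen in the baseline is invisible.
    return sorted(armed & set(executed_in_second_run))


if __name__ == "__main__":
    # Hypothetical basic-block addresses, not taken from the Calculate framework.
    all_blocks = [0x1000, 0x1010, 0x1020, 0x1030, 0x1040]
    not_plus = [0x1000, 0x1010, 0x1040]           # hit while avoiding the + button
    plus_run = [0x1000, 0x1020, 0x1030, 0x1040]   # executed while pressing +
    print([hex(b) for b in exclusive_hits(all_blocks, not_plus, plus_run)])
```

The quality of the result depends on the baseline: any block that could have been reached without the + button, but was not exercised in the first run, will wrongly appear to belong to it.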
Pai Mei has exactly this functionality. Right-click on Available Targets and select Add Target. Call it Calculator. Then right-click on that and select Add Tag. Create two tags, one called not-plus-button and another called plus-button-only. Right-click on not-plus-button and pick Use for Stalking. Then press the Refresh Process List button and find the Calculator process. Click the radio button next to Basic for basic blocks. Uncheck the box marked Heavy. That setting is for recording the context at each breakpoint; you care only about code coverage, so it is not necessary. Finally, press the Start Stalking button. It should say something like

    Setting 936 breakpoints on basic blocks in Calculate

Now start doing things within the Calculator application, except do not hit the + button. Do simple math, use the memory functions, and move the application around. As you perform actions, you’ll see breakpoints being hit within the Pai Mei GUI. The more breakpoints that are hit, the faster the application will go, as more and more of the breakpoints will already have been hit (and removed). When you can’t hit any more breakpoints, press the Stop Stalking button. Pai Mei will export all those hits into the MySQL database. You’ll see something like the following in the Pai Mei console window.

    Exporting 208 hits to MySQL

Those are basic blocks that are not associated strictly with the + button in Calculator. Now right-click the plus-button-only tag and pick Use for Stalking. Right-click the not-plus-button tag and pick Filter Tag. This means “don’t set any breakpoints on any of the hits in this tag.” Therefore, any breakpoints hit will necessarily only have to do with the + button. Press the Start Stalking button again. In Calculator, do a simple addition. Press Stop Stalking. To see these hits in the Pai Mei GUI, right-click on the plus-button-only tag and select Load Hits. Your screen will look something like Figure 4-3.
You’ll see that only four basic blocks were hit and they all seem to be in the same function. We can export these results into IDA Pro and look at them graphically. Right-click the plus-button-only tag again and select Export to IDA. This will create an IDC file, which is a script that IDA Pro understands. Now, back in IDA Pro, click File ➢ IDC File, and then select the file you just created. All the basic blocks that Pai Mei found to be executed are now colored in within IDA Pro (see Figure 4-4). In this case, all the basic blocks executed are from one function, named _functionAddDecimal. It looks like you found the code responsible for the + button!

Figure 4-3: The Pai Mei GUI displays the basic blocks associated with the + button.

Figure 4-4: IDA Pro displaying the basic blocks executed by the + button

iTunes Hates You

As discussed previously, iTunes has certain anti-debugging features built into it. Namely, it is not possible to attach to or trace the process using GDB or DTrace. Observe what happens if you try to attach to iTunes using GDB:

    (gdb) attach 1149
    Attaching to process 1149.
    Segmentation fault

This is because iTunes issues the ptrace PT_DENY_ATTACH request when it starts up and at other times within its lifetime. The man page for ptrace explains:

    PT_DENY_ATTACH
        This request is the other operation used by the traced process; it allows a process that is not currently being traced to deny future traces by its parent. All other arguments are ignored. If the process is currently being traced, it will exit with the exit status of ENOTSUP; otherwise, it sets a flag that denies future traces. An attempt by the parent to trace a process which has set this flag will result in a segmentation violation in the parent.

Trying to attach to iTunes with GDB (or any ptrace-like debugger) causes the debugger to die with a segmentation violation. How rude!
Trying to run a DTrace script against iTunes doesn’t crash, but doesn’t actually turn on the probes. From DTrace’s perspective, absolutely nothing is happening within iTunes! Presumably, this anti-debugging feature is to protect Apple’s DRM.

This mechanism is enforced in the kernel. Checking out the XNU source code reveals the magic. The file bsd/kern/mach_process.c contains the following code for the ptrace system call.

    if (uap->req == PT_DENY_ATTACH) {
        proc_lock(p);
        if (ISSET(p->p_lflag, P_LTRACED)) {
            proc_unlock(p);
            exit1(p, W_EXITCODE(ENOTSUP, 0), retval);
            /* drop funnel before we return */
            thread_exception_return();
            /* NOTREACHED */
        }
        SET(p->p_lflag, P_LNOATTACH);
        proc_unlock(p);
        return(0);
    }

When a process issues the PT_DENY_ATTACH request, it exits if it is currently being traced; otherwise it sets the P_LNOATTACH flag for the process. Later in the same function, if a process tries to attach to a process with the P_LNOATTACH flag set, the attaching process is sent SIGSEGV.

    if (uap->req == PT_ATTACH) {
        …
        if (ISSET(t->p_lflag, P_LNOATTACH)) {
            psignal(p, SIGSEGV);
        }

As for DTrace, the bsd/dev/dtrace/dtrace.c file shows what happens.

    #if defined(__APPLE__)
        /*
         * If the thread on which this probe has fired belongs to a process marked P_LNOATTACH
         * then this enabling is not permitted to observe it. Move along, nothing to see here.
         */
        if (ISSET(current_proc()->p_lflag, P_LNOATTACH)) {
            continue;
        }
    #endif /* __APPLE__ */

This comes from the dtrace_probe() function that the provider calls to fire a probe. If the process has set the P_LNOATTACH flag, DTrace doesn’t do anything.

Luckily, this mechanism is easily circumvented. In Chapter 12, “Rootkits,” we’ll show you a method that could be used to defeat it using kernel modules. For now we can use GDB manually. The basic idea is to ensure that iTunes never (successfully) calls ptrace() with the PT_DENY_ATTACH request.
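The kernel behavior quoted above can be summarized in a small toy model. The Python below is not kernel code and does not call ptrace; the Proc class and the three function names are invented for this sketch, and each one simply mirrors one of the XNU excerpts.

```python
import errno
import signal


class Proc:
    """Toy stand-in for a process; fields mirror the flags in the XNU excerpts."""

    def __init__(self, name):
        self.name = name
        self.traced = False         # plays the role of P_LTRACED
        self.no_attach = False      # plays the role of P_LNOATTACH
        self.exit_status = None     # set if the process is forced to exit
        self.pending_signal = None  # set if the process is sent a signal


def pt_deny_attach(proc):
    """Model of PT_DENY_ATTACH: exit if already traced, otherwise set the flag."""
    if proc.traced:
        proc.exit_status = errno.ENOTSUP
        return
    proc.no_attach = True


def pt_attach(tracer, target):
    """Model of PT_ATTACH: the tracer is sent SIGSEGV if the target set the flag."""
    if target.no_attach:
        tracer.pending_signal = signal.SIGSEGV
        return False
    target.traced = True
    return True


def probe_fires(proc):
    """Model of the dtrace_probe() check: flagged processes are skipped."""
    return not proc.no_attach
```

The first branch of pt_deny_attach is the reason a process started under a debugger cannot simply be allowed to make the request: it is already traced, so it would exit with ENOTSUP. The call has to be skipped altogether.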
We’ll intercept this function call in the debugger and make sure that when the parameter PT_DENY_ATTACH is passed, the function doesn’t do anything. To accomplish this goal, make sure iTunes isn’t running, start up GDB, and set a conditional breakpoint at ptrace(). (Really, this is overkill, because iTunes has no business calling ptrace(), but better safe than sorry.) Then, when it hits, have GDB make the function return without actually executing. Place these commands in a GDB init file.

    break ptrace
    condition 1 *((unsigned int *) ($esp + 4)) == 0x1f
    commands 1
    return
    c
    end

You simply set a breakpoint at ptrace, and when it is hit you tell GDB to return to the previous function in the call chain, thus not executing the ptrace code. After starting iTunes, you can safely detach from the process and debug/trace to your heart’s content.

    $ gdb /Applications/iTunes.app/Contents/MacOS/iTunes
    GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Tue Oct 2 04:07:49 UTC 2007)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-apple-darwin"…/Users/cmiller/.gdbinit:2: Error in sourced command file:
    No symbol table is loaded. Use the "file" command.
    Reading symbols for shared libraries ........................ done
    (gdb) source itunes.gdb
    Breakpoint 1 at 0xf493b24
    (gdb) run
    Starting program: /Applications/iTunes.app/Contents/MacOS/iTunes
    Reading symbols for shared libraries +++++++++++++++++++++++.................................................
    ................................ done
    Breakpoint 1 at 0x960ebb24
    Breakpoint 1, 0x960ebb24 in ptrace ()
    Reading symbols for shared libraries .. done
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries . done
    …
    ^C
    Program received signal SIGINT, Interrupt.
    0x960b04a6 in mach_msg_trap ()
    (gdb) detach
    Detaching from program: `/Applications/iTunes.app/Contents/MacOS/iTunes', process 6340 local thread 0x2d03.

Notice how the breakpoint is hit early in the process’s lifetime. You now have a running iTunes and it doesn’t have the evil P_LNOATTACH flag set. This means you can attach to it again at your leisure.

    $ gdb -p 3757
    GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Tue Oct 2 04:07:49 UTC 2007)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-apple-darwin"./Users/cmiller/.gdbinit:2: Error in sourced command file:
    No symbol table is loaded. Use the "file" command.
    /Users/cmiller/Desktop/3757: No such file or directory.
    Attaching to process 3757.
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries .......................................................................
    ...................................................................... done
    0x967359e6 in mach_msg_trap ()
    (gdb)

DTrace works as well now; apparently iTunes is displaying an episode of Chuck from Season 1:

    $ sudo dtrace -qs filemon.d 3757
    open(/dev/autofs_nowait) = 20
    open(/System/Library/Keyboard Layouts/AppleKeyboardLayouts.bundle/Contents/Info.plist) = 21
    close(21)
    close(20)
    open(/dev/autofs_nowait) = 20
    open(/System/Library/Keyboard Layouts/AppleKeyboardLayouts.bundle/Contents/Resources/English.lproj/InfoPlist.strings) = 21
    close(21)
    close(20)
    close(20)
    open(/.vol/234881026/6117526/07 Chuck Versus the Alma Mater.m4v) = 20

Order is restored to the universe.
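The breakpoint condition used in the GDB init file reads the 32-bit word at $esp + 4. On 32-bit x86, when a function is entered, the return address sits at the top of the stack and the first argument sits just above it, so this word is the request passed to ptrace(). The Python sketch below imitates that read on a made-up stack. The return address is arbitrary and the helper name first_argument is hypothetical; the value 31 (0x1f) for PT_DENY_ATTACH is the one the condition compares against.

```python
import struct

PT_DENY_ATTACH = 31  # 0x1f, the constant tested in the breakpoint condition


def first_argument(stack_bytes, esp_offset=0):
    """Return the 32-bit little-endian word at esp + 4 in a captured stack image."""
    (value,) = struct.unpack_from("<I", stack_bytes, esp_offset + 4)
    return value


# Fake stack at function entry: [esp] = return address, [esp + 4] = first
# argument, followed by the remaining three arguments to ptrace().
fake_stack = struct.pack("<IIIII", 0x00002F3A, PT_DENY_ATTACH, 0, 0, 0)

if first_argument(fake_stack) == 0x1F:
    print("ptrace(PT_DENY_ATTACH, ...) would be skipped")
```

A call with any other request value fails the comparison, so the breakpoint does not trigger and the call proceeds normally.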
Conclusion

Before diving in to learn about exploitation techniques, it is important to know how to dig into the internals of applications. We discussed GDB and ptrace on Mac OS X and how they differ from more-common implementations. We then talked about the DTrace mechanism built into the kernel. DTrace allows kernel-level runtime application tracing. We wrote several small D programs that performed some useful functions for a security researcher, such as monitoring file usage, system calls, and memory allocations. The next topic was the Mac OS X port of PyDbg. This allowed us to write several Python scripts that performed debugging functions. The scripts included such things as searching memory and in-memory fuzzing. We also showed how Pai Mei could be used to help reverse-engineer a binary. Finally, we showed how to circumvent Leopard’s attempt at anti-debugging.

References

http://landonf.bikemonkey.org/code/macosx/Leopard_PT_DENY_ATTACH.20080122.html
http://www.phrack.com/issues.html?issue=63&id=5
http://steike.com/code/debugging-itunes-with-gdb/
http://www.sun.com/bigadmin/content/dtrace/
http://www.mactech.com/articles/mactech/Vol.23/23.11/ExploringLeopardwithDTrace/index.html
http://dlc.sun.com/pdf/817-6223/817-6223.pdf
http://www.blackhat.com/presentations/bh-dc-08/Beauchamp-Weston/Whitepaper/bh-dc-08-beauchamp-weston-WP.pdf
https://www.blackhat.com/presentations/bh-usa-07/Miller/Whitepaper/bh-usa-07-miller-WP.pdf
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-3944
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1026

approach to uncover these hard-to-find bugs. Since it is difficult to write about instinct, we will spend some time introducing various techniques for finding software bugs.
The majority of these techniques will be valid for any software (or hardware), but when possible we will discuss the particular tools available to carry them out on Leopard. We’ll also discuss some ways to find bugs easily by taking advantage of some of the intricacies of the way Apple designs, develops, and tests its software.

In general, there are two methods of searching for bugs in software: static and dynamic. In static analysis, the source code or a disassembly of the binary is analyzed for problems. This may be done with tools that look for various common errors, such as buffer overflows, or by hand. Even in the presence of sophisticated tools, at some point an experienced analyst will have to sort through the results and figure out which of the identified areas of code are actually vulnerabilities. Sometimes this may be as difficult as finding the potential problem in the first place. For example, consider the following function:

    char *foo(char *src, int len){
        char *ret = malloc(len);
        strcpy(ret, src);
        return ret;
    }

It is impossible to comment on the security of this function in isolation. It certainly has the potential to be problematic, but it might take significant effort to determine whether a user has control over the inputs to this function. Can a user control src? Can the user control len? Most importantly, can a user control src and len independently? These are some of the difficulties with static analysis.

On the other hand, dynamic analysis, often called fuzzing, consists of sending invalid inputs to the program and observing whether critical errors occur. Invalid inputs for an HTTP GET request could consist of the following:

    GET / HTTP/1.0000
    GET //////////////////HTTP/1.0
    GET / HT%n%nP/1.0
    …

Obviously, there are infinitely many such inputs to try. Dynamic analysis carries the advantage of not having false positives. If the program crashes, it crashes. However, dynamic analysis does not usually understand the internals of the program.
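Malformed requests like the ones listed above are usually generated rather than written by hand. The following Python sketch shows one simple way this might be done, by repeating a character or inserting a troublesome token at a random position. The helper names, the token list, and the repeat counts are all choices made for this example and do not come from any particular fuzzer.

```python
import random

BASE_REQUEST = "GET / HTTP/1.0"


def repeat_char(request, index, count):
    """Replace the character at `index` with `count` copies of itself."""
    return request[:index] + request[index] * count + request[index + 1:]


def insert_token(request, index, token):
    """Insert `token` into the request just before position `index`."""
    return request[:index] + token + request[index:]


def fuzz_requests(base, rounds, seed=0):
    """Yield `rounds` mutated copies of `base`; the seed keeps runs repeatable."""
    rng = random.Random(seed)
    tokens = ["%n%n", "A" * 64, "\x00", "../"]
    for _ in range(rounds):
        index = rng.randrange(len(base))
        if rng.random() < 0.5:
            yield repeat_char(base, index, rng.choice([2, 16, 256]))
        else:
            yield insert_token(base, index, rng.choice(tokens))


if __name__ == "__main__":
    for case in fuzz_requests(BASE_REQUEST, 5):
        print(repr(case))
```

Seeding the generator means that a crashing input can be reproduced later from nothing more than the seed and the round number.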
For example, fuzzing consists of testing an application with invalid inputs. If these inputs are too abnormal, the program may quickly reject them, and so only a few functions of the program will actually be tested. An example of this might be a checksum that is incorrect. Likewise, if the inputs are not invalid enough, they may not cause any problems in the program under test. It can be very difficult to find the right balance and generate the most effective fuzzed inputs.

Chapter 5 ■ Finding Bugs 115

Oftentimes, the best solution is to use a combination of these two techniques. Use static analysis to find suspicious-looking areas of code and then use dynamic analysis to try to test these regions. Or use dynamic analysis to find areas of code that are hard to reach and thus hard to test, and then analyze those methods carefully using static techniques. This latter method is often helped with the use of code coverage, which we will cover shortly.

Old-School Source-Code Analysis

One of the oldest approaches to static analysis consists of simply reading the source code and looking for problems. Some of Apple’s code is open source. Unfortunately, most of it isn’t. In general, the nongraphical components of the operating system (Darwin), including the kernel, command-line utilities, system daemons, and shared libraries, tend to be open source. The GUI applications and libraries in Mac OS X are almost exclusively closed source. Nevertheless, they make use of open-source libraries and frameworks. For example, Safari is closed source, but relies heavily on the WebKit framework, which is open source. The following is an incomplete list of programs with security implications for which the source code is available. For a more detailed list, check out http://www.opensource.apple.com/darwinsource/.

■ WebKit
■ mDNSResponder
■ SecurityTokend
■ dyld
■ launchd
■ XNU

Some notable exceptions to the open-source policy include QuickTime Player, Preview, Mail, iTunes, and others.
With the source code available, a dedicated attacker can simply sit down and start reading through it, looking for bugs. This doesn’t require any specialized tools or techniques, just a little skill and a lot of patience.

Getting to the Source

The Apple open-source site tends to be a little outdated, but Apple’s source-code repositories are always up-to-date. The following are two examples of how to get the source code using CVS and SVN.

To get most projects, CVS can be used. Here is an example of downloading mDNSResponder:

    $ export CVSROOT=:pserver:anonymous@anoncvs.opensource.apple.com:/cvs/root
    $ cvs login
    Logging in to :pserver:anonymous@anoncvs.opensource.apple.com:2401/cvs/root
    CVS password: anonymous
    $ cvs co mDNSResponder

To get WebKit, use the WebKit SVN server:

    $ svn checkout http://svn.webkit.org/repository/webkit/trunk WebKit

From here, the source code is available to be read, audited, and compiled. For an exhaustive treatment of finding vulnerabilities in source code, consult The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities (Addison-Wesley, 2006). Keep in mind that the source code is often newer than the binaries that actually ship with Leopard. More on that in a bit.

Code Coverage

Code coverage is used to determine which lines of code in an application have been executed. This has been used for years by testers and quality-control engineers to find which code has been tested and which hasn’t. Security researchers can take advantage of it, too. Consider the case of code coverage used in conjunction with dynamic analysis, i.e., fuzzing. After fuzzing the system under test, code-coverage information can be obtained. This information can be used to find which portions of the code have not yet been tested by the fuzzing.
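As a concrete illustration of finding untested code, the Python sketch below picks the never-executed lines out of a line-coverage report. It assumes the text format written by GCC's gcov tool, in which each line reads "count: line number: source" and a count of ##### marks an executable line that never ran. The function name and the sample report are made up for this example.

```python
def unexecuted_lines(gcov_text):
    """Return the line numbers that a gcov-style report marks as never executed."""
    missed = []
    for line in gcov_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue  # not a "count:lineno:source" record
        count, lineno = parts[0].strip(), parts[1].strip()
        # "#####" marks an executable line that was never run; "-" means no code.
        if count == "#####" and lineno.isdigit():
            missed.append(int(lineno))
    return missed


# A small invented report in the assumed format.
SAMPLE_REPORT = """\
        -:    0:Source:example.c
        -:    1:#include <stdio.h>
        3:    2:int check(int x) {
        3:    3:    if (x > 100)
    #####:    4:        return -1;
        3:    5:    return 0;
        -:    6:}
"""

if __name__ == "__main__":
    print("Never executed:", unexecuted_lines(SAMPLE_REPORT))
```

Lines reported this way are candidates either for new fuzzed inputs designed to reach them or for a careful manual read.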
(It cannot determine, in a meaningful way, whether a given executed line has been well tested, but it can determine which lines have not been tested.) Such information can be used in refining the fuzzed inputs to improve their quality and execute additional code. Furthermore, finding the untested lines means they can be analyzed more carefully statically, or the dynamic analysis can be suitably improved to test those sections. Either way, code coverage can be a useful metric to analyze dynamic testing.

Therefore, one thing you can do with the Apple source code, besides read it, is to collect code-coverage information on it. For example, the WebKit regression-testing page (http://webkit.org/quality/testing.html) states the following:

    If you are making changes to JavaScriptCore, there is an additional test suite you must run before landing changes. This is the Mozilla JavaScript test suite.

Since WebKit is a very big project to look through for bugs, it might help to focus on the areas that are not well tested with these regression tests. That is to say, some code is not as well tested as other code, and the code that is not well tested probably has more bugs to find.

To collect code-coverage information, WebKit needs to be built with the proper flags.

    $ WebKit/WebKitTools/Scripts/build-webkit --coverage

This should build the whole package with code-coverage information built in, i.e., with the GCC flags -fprofile-arcs and -ftest-coverage. The build will likely fail at one point with an error complaining that warnings are treated as errors. In that case, you have to find and remove the -Werror flag from the compilation. For example, open the Xcode project file JavaScriptGlue.xcodeproj. Select Project ➯ Edit Project Settings and uncheck the box next to Treat Warnings as Errors. Make sure Configuration is set to All Configurations. Then quit Xcode and rebuild the WebKit project. It should build all the way through without errors.
The build succeeds if you see a message like the following:

    ===========================================================
    WebKit is now built. To run Safari with this newly-built code, use the
    "WebKitTools/Scripts/run-safari" script.

    NOTE: WebKit has been built with SVG support enabled.
    Safari will have SVG viewing capabilities.
    Your build supports the following (optional) SVG features:
    * Basic SVG animation.
    * SVG foreign object.
    * SVG fonts.
    * SVG as image.
    * SVG