Programming Google App Engine (Programming Handbook)

daixuf

Contributed on 2012-07-26

Keywords: distributed systems, cloud computing, big data, handbook, Go

Programming Google App Engine
Dan Sanderson

Beijing • Cambridge • Farnham • Köln • Sebastopol • Taipei • Tokyo

Programming Google App Engine
by Dan Sanderson

Copyright © 2010 Dan Sanderson. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.

Editor: Mike Loukides
Production Editor: Sumita Mukherji
Proofreader: Sada Preisch
Indexer: Ellen Troutman Zaig
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano

Printing History:
    November 2009: First Edition.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Programming Google App Engine, the image of a waterbuck, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

This book uses RepKover™, a durable and flexible lay-flat binding.

ISBN: 978-0-596-52272-8
[M]
1257864694

For Lisa

Table of Contents

Preface

1. Introducing Google App Engine
    The Runtime Environment
    The Static File Servers
    The Datastore
    Entities and Properties
    Queries and Indexes
    Transactions
    The Services
    Google Accounts
    Task Queues and Cron Jobs
    Developer Tools
    The Administration Console
    Things App Engine Doesn’t Do...Yet
    Getting Started

2. Creating an Application
    Setting Up the SDK
    Installing the Python SDK
    Installing the Java SDK
    Developing the Application
    The User Preferences Pattern
    Developing a Python App
    Developing a Java App
    The Development Console
    Registering the Application
    The Application ID and Title
    Setting Up a Domain Name
    Google Apps and Authentication
    Uploading the Application
    Introducing the Administration Console

3. Handling Web Requests
    The App Engine Architecture
    Configuring the Frontend
    Configuring a Python App
    Configuring a Java App
    Domain Names
    App IDs and Versions
    Request Handlers
    Static Files and Resource Files
    Secure Connections
    Authorization with Google Accounts
    How the App Is Run
    The Python Runtime Environment
    The Java Runtime Environment
    The Sandbox
    App Caching
    Logging
    Quotas and Limits
    Request Limits
    CPU Limits
    Service Limits
    Deployment Limits
    Billable Quotas
    Resource Usage Headers

4. Datastore Entities
    Entities, Keys, and Properties
    Introducing the Python Datastore API
    Introducing the Java Datastore API
    Property Values
    Strings, Text, and Blobs
    Unset Versus the Null Value
    Multivalued Properties
    Keys and Key Objects
    Using Entities
    Getting Entities Using Keys
    Inspecting Entity Objects
    Saving Entities
    Deleting Entities

5. Datastore Queries
    Queries and Kinds
    Query Results and Keys
    GQL
    The Python Query API
    The Query Class
    GQL in Python
    Retrieving Results
    Keys-Only Queries
    The Java Query API
    Keys-Only Queries in Java
    Introducing Indexes
    Automatic Indexes and Simple Queries
    All Entities of a Kind
    One Equality Filter
    Greater-Than and Less-Than Filters
    One Sort Order
    Queries on Keys
    Kindless Queries
    Custom Indexes and Complex Queries
    Multiple Sort Orders
    Filters on Multiple Properties
    Multiple Equality Filters
    Not-Equal and IN Filters
    Unset and Nonindexed Properties
    Sort Orders and Value Types
    Queries and Multivalued Properties
    A Simple Example
    MVPs in Python
    MVPs and Equality Filters
    MVPs and Inequality Filters
    MVPs and Sort Orders
    Exploding Indexes
    Configuring Indexes
    Index Configuration for Python
    Index Configuration for Java

6. Datastore Transactions
    Entities and Entity Groups
    Keys, Paths, and Ancestors
    Ancestor Queries
    What Can Happen in a Transaction
    Transactional Reads
    Transactions in Python
    Transactions in Java
    How Entities Are Updated
    How Entities Are Read
    Batch Updates
    How Indexes Are Updated

7. Data Modeling with Python
    Models and Properties
    Property Declarations
    Property Value Types
    Property Validation
    Nonindexed Properties
    Automatic Values
    List Properties
    Models and Schema Migration
    Modeling Relationships
    One-to-Many Relationships
    One-to-One Relationships
    Many-to-Many Relationships
    Model Inheritance
    Queries and PolyModels
    Creating Your Own Property Classes
    Validating Property Values
    Marshaling Value Types
    Customizing Default Values
    Accepting Arguments

8. The Java Persistence API
    Setting Up JPA
    Entities and Keys
    Entity Properties
    Embedded Objects
    Saving, Fetching, and Deleting Objects
    Transactions in JPA
    Queries and JPQL
    Relationships
    For More Information

9. The Memory Cache
    The Python Memcache API
    Setting and Getting Values in Python
    Setting and Getting Multiple Values
    Memcache Namespaces
    Cache Expiration
    Deleting Keys
    Memcache Counters
    Cache Statistics
    The Java Memcache API

10. Fetching URLs and Web Resources
    Fetching URLs in Python
    Fetching URLs in Java
    Asynchronous Requests in Python
    RPC Objects
    Processing Results with Callbacks

11. Sending and Receiving Mail and Instant Messages
    Enabling Inbound Services
    Sending Email Messages
    Sender Addresses
    Recipients
    Attachments
    Sending Email in Python
    Sending Email in Java
    Receiving Email Messages
    Receiving Email in Python
    Receiving Email in Java
    Sending XMPP Messages
    Sending a Chat Invitation
    Sending a Chat Message
    Checking a Google Talk User’s Status
    Receiving XMPP Messages
    Receiving XMPP Messages in Python
    Receiving XMPP Messages in Java

12. Bulk Data Operations and Remote Access
    Setting Up the Remote API for Python
    Setting Up the Remote API for Java
    Using the Bulk Loader Tool
    Installing SQLite
    Backup and Restore
    Uploading Data
    Downloading Data
    Controlling the Bulk Loader
    Using the Remote Shell Tool
    Using the Remote API from a Script

13. Task Queues and Scheduled Tasks
    Task Queues
    Processing Rates and Token Buckets
    Elements of a Task
    Task Handlers and Retries
    Testing and Managing Tasks
    Using Task Queues in Python
    Using Task Queues in Java
    Transactional Task Enqueueing
    Scheduled Tasks

14. The Django Web Application Framework
    Installing Django
    Creating a Django Project
    The Request Handler Script
    The Django App Engine Helper
    Creating a Django Application
    Using App Engine Models With Django
    Using Django Unit Tests and Fixtures
    Using Django Forms

15. Deploying and Managing Applications
    Uploading an Application
    Using Versions
    Managing Service Configuration
    Managing Indexes
    Browsing and Downloading Logs
    Inspecting the Datastore
    Application Settings
    Managing Developers
    Quotas and Billing
    Getting Help

Index

Preface

On the Internet, popularity is swift and fleeting. A mention of your website on a popular blog can bring 300,000 potential customers your way at once, all expecting to find out who you are and what you have to offer. But if you’re a small company just starting out, your hardware and software aren’t likely to be able to handle that kind of traffic. Chances are, you’ve sensibly built your site to handle the 30,000 visits per hour you’re actually expecting in your first 6 months. Under heavy load, such a system would be incapable of showing even your company logo to the 270,000 others that showed up to look around. And those potential customers are not likely to come back after the traffic has subsided.

The answer is not to spend time and money building a system to serve millions of visitors on the first day, when those same systems are only expected to serve mere thousands per day for the subsequent months. If you delay your launch to build big, you miss the opportunity to improve your product using feedback from your customers. Building big before allowing customers to use the product risks building something your customers don’t want.

Small companies usually don’t have access to large systems of servers on day one. The best they can do is to build small and hope meltdowns don’t damage their reputation as they try to grow. The lucky ones find their audience, get another round of funding, and halt feature development to rebuild their product for larger capacity. The unlucky ones, well, don’t.
But these days, there are other options. Large Internet companies such as Amazon.com, Google, and Microsoft are leasing parts of their high-capacity systems using a pay-per-use model. Your website is served from those large systems, which are plenty capable of handling sudden surges in traffic and ongoing success. And since you pay only for what you use, there is no up-front investment that goes to waste when traffic is low. As your customer base grows, the costs grow proportionally.

Google App Engine, Google’s application hosting service, does more than just provide access to hardware. It provides a model for building applications that grow automatically. App Engine runs your application so that each user who accesses it gets the same experience as every other user, whether there are dozens of simultaneous users or thousands. The application uses the same large-scale services that power Google’s applications for data storage and retrieval, caching, and network access. App Engine takes care of the tasks of large-scale computing, such as load balancing, data replication, and fault tolerance, automatically.

The App Engine model really kicks in at the point where a traditional system would outgrow its first database server. With such a system, adding load-balanced web servers and caching layers can get you pretty far, but when your application needs to write data to more than one place, you have a hard problem. This problem is made harder when development up to that point has relied on features of database software that were never intended for data distributed across multiple machines. Thinking about your data in terms of App Engine’s model up front takes little additional effort, and it saves you from having to rebuild the whole thing later.

Running on Google’s infrastructure means you never have to set up a server, replace a failed hard drive, or troubleshoot a network card.
And you don’t have to be woken up in the middle of the night by a screaming pager because an ISP hiccup confused a service alarm. And with automatic scaling, you don’t have to scramble to set up new hardware as traffic increases.

Google App Engine lets you focus on your application’s functionality and user experience. You can launch early, enjoy the flood of attention, retain customers, and start improving your product with the help of your users. Your app grows with the size of your audience, up to Google-sized proportions, without having to rebuild for a new architecture. Meanwhile, your competitors are still putting out fires and configuring databases.

With this book, you will learn how to develop applications that run on Google App Engine, and how to get the most out of the scalable model. A significant portion of the book discusses the App Engine scalable datastore, which does not behave like the relational databases that have been a staple of web development for the past decade. The application model and the datastore together represent a new way of thinking about web applications that, while being almost as simple as the model we’ve known, requires reconsidering a few principles we often take for granted.

This book introduces the major features of App Engine, including the scalable services (such as for sending email and manipulating images), tools for deploying and managing applications, and features for integrating your application with Google Accounts and Google Apps using your own domain name. The book also discusses techniques for optimizing your application, using task queues and offline processes, and otherwise getting the most out of Google App Engine.

Using This Book

As of this writing, App Engine supports two technology stacks for building web applications: Java and Python.
The Java technology stack lets you develop web applications using the Java programming language (or most other languages that compile to Java bytecode or have a JVM-based interpreter) and Java web technologies such as servlets and JSPs. The Python technology stack provides a fast interpreter for the Python programming language, and is compatible with several major open source web application frameworks such as Django.

This book covers concepts that apply to both technology stacks, as well as important language-specific subjects. If you’ve already decided which language you’re going to use, you probably won’t be interested in information that doesn’t apply to that language. This poses a challenge for a printed book: how should the text be organized so information about one technology doesn’t interfere with information about the other?

Foremost, we’ve tried to organize the chapters by the major concepts that apply to all App Engine applications. Where necessary, chapters split into separate sections to talk about specifics for each language. In cases where an example in one language illustrates a concept equally well for other languages, the example is given in Python. If Python is not your language of choice, hopefully you’ll be able to glean the equivalent information from other parts of the book or from the official App Engine documentation on Google’s website.

The datastore is a large enough subject that it gets multiple chapters to itself. Starting with Chapter 4, datastore concepts are introduced alongside Python and Java APIs related to those concepts. Note that we’ve taken an unconventional approach to introducing the datastore APIs by starting with the low-level APIs that map directly to datastore concepts. In your applications, you are most likely to prefer the higher level APIs of the data modeling interfaces. Data modeling is discussed separately, in Chapter 7 for Python, and in Chapter 8 for Java.
Google may release additional technology stacks for other languages in the future. If they’ve done so by the time you read this, the concepts described here should still be relevant. Check this book’s website for information about future editions.

This book has the following chapters:

Chapter 1, Introducing Google App Engine
    A high-level overview of Google App Engine and its components, tools, and major features. This chapter also includes a brief discussion of features you might expect App Engine to have but that it doesn’t have yet.

Chapter 2, Creating an Application
    An introductory tutorial for both Python and Java, including instructions on setting up a development environment, setting up accounts and domain names, and deploying the application to App Engine. The tutorial application demonstrates the use of several App Engine features (Google Accounts, the datastore, and memcache) to implement a pattern common to many web applications: storing and retrieving user preferences.

Chapter 3, Handling Web Requests
    Contains details about App Engine’s architecture, the various features of the frontend, app servers, and static file servers, and details about the app server runtime environments for Python and Java. The frontend routes requests to the app servers and the static file servers, and manages secure connections and Google Accounts authentication and authorization. This chapter also discusses quotas and limits, and how to raise them by setting a budget.

Chapter 4, Datastore Entities
    The first of several chapters on the App Engine datastore, a strongly consistent scalable object data storage system with support for local transactions. This chapter introduces data entities, keys and properties, and Python and Java APIs for creating, updating, and deleting entities.

Chapter 5, Datastore Queries
    An introduction to datastore queries and indexes, and the Python and Java APIs for queries.
    The App Engine datastore’s query engine uses prebuilt indexes for all queries. This chapter describes the features of the query engine in detail, and how each feature uses indexes. The chapter also discusses how to define and manage indexes for your application’s queries.

Chapter 6, Datastore Transactions
    How to use transactions to keep your data consistent. The App Engine datastore uses local transactions in a scalable environment. Your app arranges its entities in units of transactionality known as entity groups. This chapter attempts to provide a complete explanation of how the datastore updates data, and how to design your data and your app to best take advantage of these features.

Chapter 7, Data Modeling with Python
    How to use the Python data modeling API to enforce invariants in your data schema. The datastore itself is schemaless, a fundamental aspect of its scalability. You can automate the enforcement of data schemas using App Engine’s data modeling interface. This chapter covers Python exclusively, though Java developers may wish to skim it for advice related to data modeling.

Chapter 8, The Java Persistence API
    A brief introduction to the Java Persistence API (JPA), how its concepts translate to the datastore, how to use it to model data schemas, and how using it makes your application easier to port to other environments. JPA is a Java EE standard interface. App Engine also supports another standard interface known as Java Data Objects (JDO), though JDO is not covered in this book. This chapter covers Java exclusively.

Chapter 9, The Memory Cache
    App Engine’s memory cache service (aka “memcache”), and its Python and Java APIs. Aggressive caching is essential for high-performance web applications.

Chapter 10, Fetching URLs and Web Resources
    How to access other resources on the Internet via HTTP using the URL Fetch service.
    This chapter covers the Python and Java interfaces, including implementations of standard URL fetching libraries. It also describes the asynchronous URL Fetch interface, which as of this writing is exclusive to Python.

Chapter 11, Sending and Receiving Mail and Instant Messages
    How to use App Engine services to send email and instant messages to XMPP-compatible services (such as Google Talk). This chapter covers receiving email and XMPP chat messages relayed by App Engine using request handlers. It also discusses creating and processing messages using tools in the API.

Chapter 12, Bulk Data Operations and Remote Access
    How to perform large maintenance operations on your live application using scripts running on your computer. Tools included with the SDK make it easy to back up, restore, load, and retrieve data in your app’s datastore. You can also write your own tools using the remote access API for data transformations and other jobs. You can also run an interactive Python command shell that uses the remote API to manipulate a live Python or Java app.

Chapter 13, Task Queues and Scheduled Tasks
    How to perform work outside of user requests using task queues. Task queues perform tasks in parallel by running your code on multiple application servers. You control the processing rate with configuration. Tasks can also be executed on a regular schedule with no user interaction.

Chapter 14, The Django Web Application Framework
    How to use the Django web application framework with the Python runtime environment. This chapter discusses setting up a Django project, using the Django App Engine Helper, and taking advantage of features of Django via the Helper such as using the App Engine data modeling interface with forms and test fixtures.

Chapter 15, Deploying and Managing Applications
    How to upload and run your app on App Engine, how to update and test an application using app versions, and how to manage and inspect the running application.
    This chapter also introduces other maintenance features of the Administration Console, including billing. We conclude with a list of places to go for help and further reading.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
    Shows commands or other text that should be typed literally by the user.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

This icon signifies a tip, suggestion, or general note.

Using Code Samples

This book is here to help you get your job done. In general, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Programming Google App Engine by Dan Sanderson. Copyright 2010 Dan Sanderson, 978-0-596-52272-8.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
Safari® Books Online

Safari Books Online is an on-demand digital library that lets you easily search over 7,500 technology and creative reference books and videos to find the answers you need quickly.

With a subscription, you can read any page and watch any video from our library online. Read books on your cell phone and mobile devices. Access new titles before they are available for print, and get exclusive access to manuscripts in development and post feedback for the authors. Copy and paste code samples, organize your favorites, download chapters, bookmark key sections, create notes, print out pages, and benefit from tons of other time-saving features.

O’Reilly Media has uploaded this book to the Safari Books Online service. To have full digital access to this book and others on similar topics from O’Reilly and other publishers, sign up for free at http://my.safaribooksonline.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

    O’Reilly Media, Inc.
    1005 Gravenstein Highway North
    Sebastopol, CA 95472
    800-998-9938 (in the United States or Canada)
    707-829-0515 (international or local)
    707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at:

    http://oreilly.com/catalog/9780596522728

You can also download the examples from the author’s website:

    http://www.dansanderson.com/appengine

To comment or ask technical questions about this book, send email to:

    bookquestions@oreilly.com

For more information about our books, conferences, Resource Centers, and the O’Reilly Network, see our website at:

    http://www.oreilly.com

Acknowledgments

I owe a great deal of thanks to the App Engine team, of which I’ve been a proud member since 2008. This book would not exist without the efforts and leadership of Paul McDonald, Pete Koomen, and App Engine’s fearless tech lead, Kevin Gibbs.
I am especially indebted to the App Engine datastore team, who have made significant contributions to the datastore chapters. Ryan Barrett, lead datastore engineer, provided many hours of conversation and detailed technical review. Max Ross, implementor of the Java datastore interfaces and the JDO and JPA adapters, wrote major portions of Chapter 8. Rafe Kaplan, designer of the Python data modeling library, contributed portions of Chapter 7. My thanks to them.

Thanks to Matthew Blain, Michael Davidson, Alex Gaysinsky, Peter McKenzie, Don Schwarz, and Jeffrey Scudder for reviewing portions of the book in detail. Thanks also to Andy Smith for making last-minute improvements to the Django Helper in time to be included here.

Many other App Engine contributors had a hand, directly or indirectly, in making this book what it is: Freeland Abbott, Mike Aizatsky, Ken Ashcraft, Anthony Baxter, Chris Beckmann, Andrew Bowers, Matthew Brown, Ryan Brown, Hannah Chen, Lei Chen, Jason Cooper, Mark Dalrymple, Pavni Diwanji, Brad Fitzpatrick, Alfred Fuller, David Glazer, John Grabowski, Joe Gregorio, Raju Gulabani, Justin Haugh, Jeff Huber, Kevin Jin, Erik Johnson, Nick Johnson, Mickey Kataria, Scott Knaster, Marc Kriguer, Alon Levi, Sean Lynch, Gianni Mariani, Mano Marks, Jon McAlister, Sean McBride, Marzia Niccolai, Alan Noble, Brandon Nutter, Karsten Petersen, George Pirocanac, Alexander Power, Mike Repass, Toby Reyelts, Fred Sauer, Jens Scheffler, Robert Schuppenies, Lindsey Simon, John Skidgel, Brett Slatkin, Graham Spencer, Amanda Surya, David Symonds, Joseph Ternasky, Eric Tholomé, Troy Trimble, Guido van Rossum, Nicholas Verne, Michael Winton, and Wenbo Zhu.

Thanks also to Dan Morrill, Mark Pilgrim, Steffi Wu, Karen Wickre, Jane Penner, Jon Murchinson, Tom Stocky, Vic Gundotra, Bill Coughran, and Alan Eustace.
At O’Reilly, I’m eternally grateful to Michael Loukides, who had nothing but good advice and an astonishing amount of patience for a first-time author. Let’s do another one!

CHAPTER 1
Introducing Google App Engine

Google App Engine is a web application hosting service. By “web application,” we mean an application or service accessed over the Web, usually with a web browser: storefronts with shopping carts, social networking sites, multiplayer games, mobile applications, survey applications, project management, collaboration, publishing, and all of the other things we’re discovering are good uses for the Web. App Engine can serve traditional website content too, such as documents and images, but the environment is especially designed for real-time dynamic applications.

In particular, Google App Engine is designed to host applications with many simultaneous users. When an application can serve many simultaneous users without degrading performance, we say it scales. Applications written for App Engine scale automatically. As more people use the application, App Engine allocates more resources for the application and manages the use of those resources. The application itself does not need to know anything about the resources it is using.

Unlike traditional web hosting or self-managed servers, with Google App Engine, you only pay for the resources you use. These resources are measured down to the gigabyte, with no monthly fees or up-front charges. Billed resources include CPU usage, storage per month, incoming and outgoing bandwidth, and several resources specific to App Engine services. To help you get started, every developer gets a certain amount of resources for free, enough for small applications with low traffic. Google estimates that with the free resources, an app can accommodate about 5 million page views a month.
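To put the free-quota estimate in perspective, the monthly figure can be converted into an average request rate. The following is a rough back-of-the-envelope sketch rather than an official calculation: it assumes a 30-day month and one request per page view, and the names used are illustrative only.

```python
# Figure quoted in the text: about 5 million page views a month.
PAGE_VIEWS_PER_MONTH = 5_000_000

# Assumption for this sketch: a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def average_views_per_second(views_per_month, seconds_per_month=SECONDS_PER_MONTH):
    """Return the average number of page views per second over one month."""
    return views_per_month / seconds_per_month


if __name__ == "__main__":
    rate = average_views_per_second(PAGE_VIEWS_PER_MONTH)
    print(round(rate, 2))  # prints 1.93
```

Real traffic is rarely uniform, so peak load may be many times this average.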
App Engine can be described as three parts: the runtime environment, the datastore, and the scalable services. In this chapter, we’ll look at each of these parts at a high level. We’ll also discuss features of App Engine for deploying and managing web applications, and for building websites integrated with other Google offerings such as Google Apps and Google Accounts.

The Runtime Environment

An App Engine application responds to web requests. A web request begins when a client, typically a user’s web browser, contacts the application with an HTTP request, such as to fetch a web page at a URL. When App Engine receives the request, it identifies the application from the domain name of the address, either an .appspot.com subdomain (provided for free with every app) or a subdomain of a custom domain name you have registered and set up with Google Apps. App Engine selects a server from many possible servers to handle the request, making its selection based on which server is most likely to provide a fast response. It then calls the application with the content of the HTTP request, receives the response data from the application, and returns the response to the client.

From the application’s perspective, the runtime environment springs into existence when the request handler begins, and disappears when it ends. App Engine provides at least two methods for storing data that persists between requests (discussed later), but these mechanisms live outside of the runtime environment. By not retaining state in the runtime environment between requests (or at least, by not expecting that state will be retained between requests), App Engine can distribute traffic among as many servers as it needs to give every request the same treatment, regardless of how much traffic it is handling at one time.

Application code cannot access the server on which it is running in the traditional sense.
An application can read its own files from the filesystem, but it cannot write to files, and it cannot read files that belong to other applications. An application can see environment variables set by App Engine, but manipulations of these variables do not necessarily persist between requests. An application cannot access the networking facilities of the server hardware, though it can perform networking operations using services.

In short, each request lives in its own “sandbox.” This allows App Engine to handle a request with the server that would, in its estimation, provide the fastest response. There is no way to guarantee that the same server hardware will handle two requests, even if the requests come from the same client and arrive relatively quickly.

Sandboxing also allows App Engine to run multiple applications on the same server without the behavior of one application affecting another. In addition to limiting access to the operating system, the runtime environment also limits the amount of clock time, CPU use, and memory a single request can take. App Engine keeps these limits flexible, and applies stricter limits to applications that use up more resources to protect shared resources from “runaway” applications.

A request has up to 30 seconds to return a response to the client. While that may seem like a comfortably large amount for a web app, App Engine is optimized for applications that respond in less than a second. Also, if an application uses many CPU cycles, App Engine may slow it down so the app isn’t hogging the processor on a machine serving multiple apps. A CPU-intensive request handler may take more clock time to complete than it would if it had exclusive use of the processor, and clock time may vary as App Engine detects patterns in CPU usage and allocates accordingly.
Google App Engine provides two possible runtime environments for applications: a Java environment and a Python environment. The environment you choose depends on the language and related technologies you want to use for developing the application.

The Java environment runs applications built for the Java 6 Virtual Machine (JVM). An app can be developed using the Java programming language, or most other languages that compile to or otherwise run in the JVM, such as PHP (using Quercus), Ruby (using JRuby), JavaScript (using the Rhino interpreter), Scala, and Groovy. The app accesses the environment and services using interfaces based on web industry standards, including Java servlets and the Java Persistence API (JPA). Any Java technology that functions within the sandbox restrictions can run on App Engine, making it suitable for many existing frameworks and libraries. Notably, App Engine fully supports Google Web Toolkit (GWT), a framework for rich web applications that lets you write all of the app’s code, including the user interface, in the Java language, and have your rich graphical app work with all major browsers without plug-ins.

The Python environment runs apps written in the Python 2.5 programming language, using a custom version of CPython, the official Python interpreter. App Engine invokes a Python app using CGI, a widely supported application interface standard. An application can use most of Python’s large and excellent standard library, as well as rich APIs and libraries for accessing services and modeling data. Many open source Python web application frameworks work with App Engine, such as Django, web2py, and Pylons, and App Engine even includes a simple framework of its own.

The Java and Python environments use the same application server model: a request is routed to an app server, the application is started on the app server (if necessary) and invoked to handle the request to produce a response, and the response is returned to the client.
Each environment runs its interpreter (the JVM or the Python interpreter) with sandbox restrictions, such that any attempt to use a feature of the language or a library that would require access outside of the sandbox fails with an exception.

While using a different server for every request has advantages for scaling, it’s time-consuming to start up a new instance of the application for every request. App Engine mitigates startup costs by keeping the application in memory on an application server as long as possible and reusing servers intelligently. When a server needs to reclaim resources, it purges the least recently used app. All app servers have the runtime environment (JVM or Python interpreter) preloaded before the request reaches the server, so only the app itself needs to be loaded on a fresh server.

Applications can exploit the app caching behavior to cache data directly on the app server using global (static) variables. Since an app can be evicted between any two requests (and low-traffic apps are evicted frequently), and there is no guarantee that a given user’s requests will be handled by a given server, global variables are mostly useful for caching startup resources, like parsed configuration files.

I haven’t said anything about which operating system or hardware configuration App Engine uses. There are ways to figure out what operating system or hardware a server is using, but in the end it doesn’t matter: the runtime environment is an abstraction above the operating system that allows App Engine to manage resource allocation, computation, request handling, scaling, and load distribution without the application’s involvement. Features that typically require knowledge of the operating system are either provided by services outside of the runtime environment, provided or emulated using standard library calls, or restricted in logical ways within the definition of the sandbox.
The Static File Servers

Most websites have resources they deliver to browsers that do not change during the regular operation of the site. The images and CSS files that describe the appearance of the site, the JavaScript code that runs in the browser, and HTML files for pages without dynamic components are examples of these resources, collectively known as static files. Since the delivery of these files doesn’t involve application code, it’s unnecessary and inefficient to serve them from the application servers.

Instead, App Engine provides a separate set of servers dedicated to delivering static files. These servers are optimized for both internal architecture and network topology to handle requests for static resources. To the client, static files look like any other resource served by your app.

You upload the static files of your application right alongside the application code. You can configure several aspects of how static files are served, including the URLs for static files, content types, and instructions for browsers to keep copies of the files in a cache for a given amount of time to reduce traffic and speed up rendering of the page.

The Datastore

Most useful web applications need to store information during the handling of a request for retrieval during a later request. A typical arrangement for a small website involves a single database server for the entire site, and one or more web servers that connect to the database to store or retrieve data. Using a single central database server makes it easy to have one canonical representation of the data, so multiple users accessing multiple web servers all see the same and most recent information. But a central server is difficult to scale once it reaches its capacity for simultaneous connections.
By far the most popular kind of data storage system for web applications in the past decade has been the relational database, with tables of rows and columns arranged for space efficiency and concision, and with indexes and raw computing power for performing queries, especially “join” queries that can treat multiple related records as a queryable unit. Other kinds of data storage systems include hierarchical datastores (filesystems, XML databases) and object databases. Each kind of database has pros and cons, and which type is best suited for an application depends on the nature of the application’s data and how it is accessed. And each kind of database has its own techniques for growing past the first server.

Google App Engine’s database system most closely resembles an object database. It is not a join-query relational database, and if you come from the world of relational-database-backed web applications (as I did), this will probably require changing the way you think about your application’s data. As with the runtime environment, the design of the App Engine datastore is an abstraction that allows App Engine to handle the details of distributing and scaling the application, so your code can focus on other things.

Entities and Properties

An App Engine application stores its data as one or more datastore entities. An entity has one or more properties, each of which has a name, and a value that is of one of several primitive value types. Each entity is of a named kind, which categorizes the entity for the purpose of queries.

At first glance, this seems similar to a relational database: entities of a kind are like rows in a table, and properties are like columns (fields). However, there are two major differences between entities and rows. First, an entity of a given kind is not required to have the same properties as other entities of the same kind.
Second, an entity can have a property of the same name as another entity has, but with a different type of value. In this way, datastore entities are “schemaless.” As you’ll soon see, this design provides both powerful flexibility and some maintenance challenges.

Another difference between an entity and a table row is that an entity can have multiple values for a single property. This feature is a bit quirky, but can be quite useful once understood.

Every datastore entity has a unique key that is either provided by the application or generated by App Engine (your choice). Unlike a relational database, the key is not a “field” or property, but an independent aspect of the entity. You can fetch an entity quickly if you know its key, and you can perform queries on key values.

An entity’s key cannot be changed after the entity has been created. Neither can its kind. App Engine uses the entity’s kind and key to help determine where the entity is stored in a large collection of servers, though neither the key nor the kind ensures that two entities are stored on the same server.

Queries and Indexes

A datastore query returns zero or more entities of a single kind. It can also return just the keys of entities that would be returned for a query. A query can filter based on conditions that must be met by the values of an entity’s properties, and can return entities ordered by property values. A query can also filter and sort using keys.

In a typical relational database, queries are planned and executed in real time against the data tables, which are stored as they were designed by the developer. The developer can also tell the database to produce and maintain indexes on certain columns to speed up certain queries.

App Engine does something dramatically different. With App Engine, every query has a corresponding index maintained by the datastore.
When the application performs a query, the datastore finds the index for that query, scans down to the first row that matches the query, then returns the entity for each consecutive row in the index until the first row that doesn’t match the query.

Of course, this requires that App Engine know ahead of time which queries the application is going to perform. It doesn’t need to know the values of the filters in advance, but it does need to know the kind of entity to query, the properties being filtered or sorted, and the operators of the filters and the orders of the sorts.

App Engine provides a set of indexes for simple queries by default, based on which properties exist on entities of a kind. For more complex queries, an app must include index specifications in its configuration. The App Engine SDK helps produce this configuration file by watching which queries are performed as you test your application with the provided development web server on your computer. When you upload your app, the datastore knows to make indexes for every query the app performed during testing. You can also edit the index configuration manually.

When your application creates new entities and updates existing ones, the datastore updates every corresponding index. This makes queries very fast (each query is a simple table scan) at the expense of entity updates (possibly many tables may need updating for a single change). In fact, the performance of an index-backed query is not affected by the number of entities in the datastore, only the size of the result set.

It’s worth paying attention to indexes, as they take up space and increase the time it takes to update entities. We discuss indexes in detail in Chapter 5.

Transactions

When an application has many clients attempting to read or write the same data simultaneously, it is imperative that the data always be in a consistent state.
One user should never see half-written data or data that doesn’t make sense because another user’s action hasn’t completed.

When an application updates the properties of a single entity, App Engine ensures that either every update to the entity succeeds all at once, or the entire update fails and the entity remains the way it was prior to the beginning of the update. Other users do not see any effects of the change until the change succeeds.

In other words, an update of a single entity occurs in a transaction. Each transaction is atomic: the transaction either succeeds completely or fails completely, and cannot succeed or fail in smaller pieces.

An application can read or update multiple entities in a single transaction, but it must tell App Engine which entities will be updated together when it creates the entities. The application does this by creating entities in entity groups. App Engine uses entity groups to control how entities are distributed across servers, so it can guarantee a transaction on a group succeeds or fails completely. In database terms, the App Engine datastore natively supports local transactions.

When an application calls the datastore API to update an entity, control does not return to the application until the transaction succeeds or fails, and the call returns with knowledge of success or failure. For updates, this means the application waits for all entities and indexes to be updated before doing anything else.

If a user tries to update an entity while another user’s update of the entity is in progress, the datastore returns immediately with a concurrency failure exception. It is often appropriate for the app to retry a bounced transaction several times before declaring the condition an error, usually retrieving data that may have changed within the transaction before calculating new values and updating it.
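The retry pattern just described can be sketched with a toy in-memory store. This is a simplified model under stated assumptions, not the datastore's actual mechanism or API: VersionedStore, ConcurrencyError, and increment_with_retries are hypothetical names, and conflicts are detected here by comparing a version number at write time.

```python
class ConcurrencyError(Exception):
    """Raised when another writer changed the entity first (hypothetical)."""


class VersionedStore(object):
    """A toy in-memory store: each key maps to a (version, value) pair."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Unknown keys read as version 0 with a value of 0.
        return self._data.get(key, (0, 0))

    def write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            # Someone else committed since we read: reject this write.
            raise ConcurrencyError(key)
        self._data[key] = (current_version + 1, value)


def increment_with_retries(store, key, retries=3, before_write=None):
    """Read, compute, and write; on conflict, re-read and try again."""
    for attempt in range(retries):
        version, value = store.read(key)  # fetch fresh data on every attempt
        if before_write is not None:
            before_write(attempt)  # hook used to simulate a rival writer
        try:
            store.write(key, version, value + 1)
            return value + 1
        except ConcurrencyError:
            continue  # the data changed under us; start over
    raise ConcurrencyError("gave up after %d attempts" % retries)
```

The important detail is that each attempt re-reads the entity before recalculating, so a retry never writes a value computed from stale data. After the retry budget is exhausted, the failure is surfaced to the caller rather than silently dropped.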
In database terms, App Engine uses optimistic concurrency control.

Reading the entity never fails due to concurrency; the application just sees the entity in its most recent stable state. You can also perform multiple reads in a transaction to ensure that all of the data read in the transaction is current and consistent with itself.

In most cases, retrying a transaction on a contested entity will succeed. But if an application is designed such that many users might update a single entity, the more popular the application gets, the more likely users will get concurrency failures. It is important to design entity groups to avoid concurrency failures even with a large number of users.

An application can bundle multiple datastore operations in a single transaction. For example, the application can start a transaction, read an entity, update a property value based on the last read value, save the entity, then commit the transaction. In this case, the save action does not occur unless the entire transaction succeeds without conflict with another transaction. If there is a conflict and the app wants to try again, the app should retry the entire transaction: read the (possibly updated) entity again, use the new value for the calculation, and attempt the update again.

With indexes and optimistic concurrency control, the App Engine datastore is designed for applications that need to read data quickly, ensure that the data it sees is in a consistent form, and scale the number of users and the size of the data automatically. While these goals are somewhat different from those of a relational database, they are especially well suited to web applications.

The Services

The datastore’s relationship with the runtime environment is that of a service: the application uses an API to access a separate system that manages all of its own scaling needs separately from the runtime environment.
Google App Engine includes several other self-scaling services useful for web applications.

The memory cache (or memcache) service is a short-term key-value storage service. Its main advantage over the datastore is that it is fast, much faster than the datastore for simple storage and retrieval. The memcache stores values in memory instead of on disk for faster access. It is distributed like the datastore, so every request sees the same set of keys and values. However, it is not persistent like the datastore: if a server goes down, such as during a power failure, memory is erased. It also has a more limited sense of atomicity and transactionality than the datastore. As the name implies, the memcache service is best used as a cache for the results of frequently performed queries or calculations. The application checks for a cached value, and if the value isn’t there, it performs the query or calculation and stores the value in the cache for future use.

App Engine applications can access other web resources using the URL Fetch service. The service makes HTTP requests to other servers on the Internet, such as to retrieve pages or interact with web services. Since remote servers can be slow to respond, the URL Fetch API supports fetching URLs in the background while a request handler does other things, but in all cases the fetch must start and finish within the request handler’s lifetime. The application can also set a deadline, after which the call is canceled if the remote host hasn’t responded.

App Engine applications can send messages using the Mail service. Messages can be sent on behalf of the application or on behalf of the user who made the request that is sending the email (if the message is from the user). Many web applications use email to notify users, confirm user actions, and validate contact information.

An application can also receive email messages.
If an app is configured to receive email, a message sent to the app’s address is routed to the Mail service, which delivers the message to the app in the form of an HTTP request to a request handler.

App Engine applications can send and receive instant messages to and from chat services that support the XMPP protocol, including Google Talk. An app sends an XMPP chat message by calling the XMPP service. As with incoming email, when someone sends a message to the app’s address, the XMPP service delivers it to the app by calling a request handler.

The image processing service can do lightweight transformations of image data, such as for making thumbnail images of uploaded photos. The image processing tasks are performed using the same infrastructure Google uses to process images with some of its other products, so the results come back quickly. We won’t be covering the image service API in this book because Google’s official documentation says everything there is to say about this easy-to-use service.

Google Accounts

App Engine features integration with Google Accounts, the user account system used by Google applications such as Google Mail, Google Docs, and Google Calendar. You can use Google Accounts as your app’s account system, so you don’t have to build your own. And if your users already have Google accounts, they can sign in to your app using their existing accounts, with no need to create new accounts just for your app.

Of course, there is no obligation to use Google Accounts. You can always build your own account system, or use an OpenID provider. Google Accounts is especially useful for developing applications for your company or organization using Google Apps. With Google Apps, your organization’s members can use the same account to access your custom applications as well as their email, calendar, and documents.
Task Queues and Cron Jobs

A web application has to respond to web requests very quickly, usually in less than a second and preferably in just a few dozen milliseconds, to provide a smooth experience to the user sitting in front of the browser. This doesn’t give the application much time to do work. Sometimes, there is more work to do than there is time to do it. In such cases it’s usually OK if the work gets done within a few seconds, minutes, or hours, instead of right away, as the user is waiting for a response from the server. But the user needs a guarantee that the work will get done.

For this kind of work, App Engine uses task queues. Task queues let request handlers describe work to be done at a later time, outside the scope of the web request. Queues ensure that every task gets done eventually. If a task fails, the queue retries the task until it succeeds. You can configure the rate at which queues are processed to spread the workload throughout the day.

A queue performs a task by calling a request handler. It can include a data payload provided by the code that created the task, delivered to the task’s handler as an HTTP request. The task’s handler is subject to the same limits as other request handlers, including the 30-second time limit.

An especially powerful feature of task queues is the ability to enqueue a task within a datastore transaction. This ensures that the task will be enqueued only if the rest of the datastore transaction succeeds. You can use transactional tasks to perform additional datastore operations that must be consistent with the transaction eventually, but that do not need the strong consistency guarantees of the datastore’s local transactions.

App Engine has another service for executing tasks at specific times of the day. Scheduled tasks are also known as “cron jobs,” a name borrowed from a similar feature of the Unix operating system.
The scheduled tasks service can invoke a request handler at a specified time of the day, week, or month, based on a schedule you provide when you upload your application. Scheduled tasks are useful for doing regular maintenance or sending periodic notification messages.

We’ll look at these features and some powerful uses for them in Chapter 13.

Developer Tools

Google provides free tools for developing App Engine applications in Java or Python. You can download the software development kit (SDK) for your chosen language and your computer’s operating system from Google’s website. Java users can get the Java SDK in the form of a plug-in for the Eclipse integrated development environment. Python users using Windows or Mac OS X can get the Python SDK in the form of a GUI application. Both SDKs are also available as ZIP archives of command-line tools, for using directly or integrating into your development environment or build system.

Each SDK includes a development web server that runs your application on your local computer and simulates the runtime environment, the datastore, and the services. The development server automatically detects changes in your source files and reloads them as needed, so you can keep the server running while you develop the application.

If you’re using Eclipse, you can run the Java development server in the interactive debugger, and can set breakpoints in your application code. You can also use Eclipse for Python app development using PyDev, an Eclipse extension that includes an interactive Python debugger. (Using PyDev is not covered in this book, but there are instructions on Google’s site.)

The development version of the datastore can automatically generate configuration for query indexes as the application performs queries, which App Engine will use to prebuild indexes for those queries. You can turn this feature off for testing whether queries have appropriate indexes in the configuration.
The development web server includes a built-in web application for inspecting the contents of the (simulated) datastore. You can also create new datastore entities using this interface for testing purposes.

Each SDK also includes a tool for interacting with the application running on App Engine. Primarily, you use this tool to upload your application code to App Engine. You can also use this tool to download log data from your live application, or manage the live application’s indexes.

The Python and Java SDKs include a feature you can install in your app for secure remote programmatic access to your live application. The Python SDK includes tools that use this feature for bulk data operations, such as uploading new data from a text file and downloading large amounts of data for backup or migration purposes. The SDK also includes a Python interactive command-line shell for testing, debugging, and manually manipulating live data. (These tools are in the Python SDK, but also work with Java apps using the Java version of the remote access feature.) You can write your own scripts and programs that use the remote access feature for large-scale data transformations or other maintenance.

The Administration Console

When your application is ready for its public debut, you create an administrator account and set up the application on App Engine. You use your administrator account to create and manage the application, view its access and resource usage statistics and message logs, and more, all with a web-based interface called the Administration Console.

You sign in to the Administration Console using your Google account. You can use your current Google account if you have one, though you may also want to create a Google account just for your application, which you might use as the “from” address on email messages.
Once you have created an application using the Administration Console, you can add additional Google accounts as administrators. Any administrator can access the Console, and can upload a new version of the application.

The Console gives you access to real-time performance data about how your application is being used, as well as access to log data emitted by your application. You can also query the datastore for the live application using a web interface, and check on the status of datastore indexes. (Newly created indexes with large data sets take time to build.)

When you upload new code for your application using the SDK, the uploaded version is assigned a version identifier, which you specify in the application’s configuration file. The version used for the live application is whichever major version is selected as the “default.” You control which version is the “default” using the Administration Console. You can access nondefault versions using a special URL containing the version identifier. This allows you to test a new version of an app running on App Engine before making it official.

You use the Console to set up and manage the billing account for your application. When you’re ready for your application to consume more resources beyond the free amounts, you set up a billing account using a credit card and Google Accounts. The owner of the billing account sets a budget, a maximum amount of money that can be charged per calendar day. Within that budget, you can allocate how much additional CPU time, bandwidth, storage, and email recipients the app can consume. You are only charged for what the application actually uses beyond the free amounts.

Things App Engine Doesn’t Do...Yet

When people first start using App Engine, there are several things they ask about that App Engine doesn’t do.
Some of these are things Google may implement in the near future, and others run against the grain of the App Engine design and aren’t likely to be added. Listing such features in a book is difficult, because by the time you read this, Google may have already implemented them. But it’s worth noting these features here, especially to note workaround techniques.

App Engine supports secure connections (HTTPS) to .appspot.com subdomains, but does not yet support secure connections to custom domains. Google Accounts sign-ins always use secure connections. An application can use the URL Fetch service to make an HTTPS request to another site, but App Engine does not verify the certificate used on the remote server.

An app can receive incoming email and XMPP chat messages at several addresses. As of this writing, none of these addresses can use a custom domain name. See Chapter 11 for information on incoming email and XMPP addresses.

An app can accept web requests on a custom domain using Google Apps. Google Apps maps a subdomain of your custom domain to an app, and this subdomain can be www if you choose. This does not yet support requests for “naked” domains, such as http://example.com/. It also does not support arbitrary tertiary domains on custom domains (http://foo.www.example.com). App Engine does support arbitrary subdomains on appspot.com URLs, such as foo.app-id.appspot.com.

App Engine does not host long-running background processes. Task queues and scheduled tasks can invoke request handlers outside of a user request, and can drive some kinds of batch processing. But processing large chores in small batches is different in character and range from full-scale distributed computing tasks. We will discuss batch processing later in Chapter 12.

App Engine does not support streaming or long-term connections. If the client supports it, the app can use XMPP and an XMPP service (such as Google Talk) to deliver state updates to the client.
You could also do this using a polling technique, where the client asks the application for updates on a regular basis, but polling is difficult to scale (5,000 simultaneous users polling every 5 seconds = 1,000 queries per second), and is not appropriate for all applications. Also note that request handlers cannot communicate with the client while performing other calculations. The server sends a response to the client’s request only after the handler has returned control to the server.

App Engine only supports web requests via HTTP or HTTPS, and email and XMPP messages via the services. It does not support other kinds of network connections. For instance, a client cannot connect to an App Engine application via FTP.

The App Engine datastore does not support full-text search queries, such as for implementing a search engine for a content management system. Long text values are not indexed, and short text values are only indexed for equality and inequality queries. It is possible to implement text search by building search indexes within the application, but this is difficult to do in a scalable way for large amounts of dynamic data.

Getting Started

You can start developing applications for Google App Engine without creating an account. All you need to get started is the App Engine SDK appropriate for your choice of language, which is a free download from the App Engine website:

http://code.google.com/appengine/

While you’re there, check out the official “Getting Started Guide” for your language, which demonstrates how to create an application and use several of App Engine’s features.

In the next chapter, we’ll describe how to create a new project from start to finish, including how to create an account, upload the application, and run it on App Engine.
CHAPTER 2
Creating an Application

The App Engine development model is as simple as it gets:

1. Create the application.
2. Test the application on your own computer using the web server software included with the App Engine development kit.
3. Upload the finished application to App Engine.

In this chapter, we will walk through the process of creating a new application, testing it with the development server, registering a new application ID and setting up a domain name, and uploading the app to App Engine. We will look at some of the features of the Python and Java software development kits (SDKs) and the App Engine Administration Console. We’ll also discuss the workflow for developing and deploying an app.

We will take this opportunity to demonstrate a common pattern in web applications: managing user preferences data. This pattern uses several App Engine services and features.

Setting Up the SDK

All the tools and libraries you need to develop an application are included in the App Engine SDK. There are separate SDKs for Python and Java, each with features useful for developing in that language. The SDKs work on any platform, including Windows, Mac OS X, and Linux.

The Python and Java SDKs each include a web server that runs your app in a simulated runtime environment on your computer. The development server enforces the sandbox restrictions of the full runtime environment and simulates each of the App Engine services. You can start the development server and leave it running while you build your app, reloading pages in your browser to see your changes in effect.

Both SDKs include a multifunction tool for interacting with the app running on App Engine. You use this tool to upload your app’s code, static files, and configuration.
The tool can also manage datastore indexes, task queues, and scheduled tasks, and can download messages logged by the live application so you can analyze your app’s traffic and behavior.

Because Google launched Python support before Java, the Python SDK has a few tools not available in the Java SDK. Most notably, the Python SDK includes tools for uploading and downloading data to and from the datastore. This is useful for making backups, changing the structure of existing data, and for processing data offline. If you are using Java, you can use the Python-based data tools with a bit of effort.

The Python SDKs for Windows and Mac OS X include a “launcher” application that makes it especially easy to create, edit, test, and upload an app using a simple graphical interface. Paired with a good programming text editor (such as Notepad++ for Windows, or TextMate for Mac OS X), the launcher provides a fast and intuitive Python development experience.

For Java developers, Google provides a plug-in for the Eclipse integrated development environment that implements a complete App Engine development workflow. The plug-in includes a template for creating new App Engine Java apps, as well as a debugging profile for running the app and the development web server in the Eclipse debugger. To deploy a project to App Engine, you just click a button on the Eclipse toolbar.

Both SDKs also include cross-platform command-line tools that provide these features. You can use these tools from a command prompt, or otherwise integrate them into your development environment as you see fit.

We’ll discuss the Python SDK first, then the Java SDK in “Installing the Java SDK.” Feel free to skip the section that does not apply to your chosen language.

Installing the Python SDK

The App Engine SDK for the Python runtime environment runs on any computer that runs Python 2.5.
If you are using Mac OS X or Linux, or if you have used Python previously, you may already have Python on your system. You can test whether Python is installed on your system and check which version is installed by running the following command at a command prompt (in Windows, Command Prompt; in Mac OS X, Terminal):

    python -V

(That’s a capital “V.”) If Python is installed, it prints its version number, like so:

    Python 2.5.2

You can download and install Python 2.5 for your platform from the Python website:

http://www.python.org/

Be sure to get Python version 2.5 (such as 2.5.4) from the “Download” section of the site. As of this writing, the latest major version of Python is 3.1, and the latest 2.x-compatible release is 2.6. The App Engine Python SDK works with Python 2.6, but it’s better to use the same version of Python that’s used on App Engine for development so you are not surprised by obscure compatibility issues.

App Engine Python does not yet support Python 3. Python 3 includes several new language and library features that are not backward compatible with earlier versions. When App Engine adds support for Python 3, it will likely be in the form of a new runtime environment, in addition to the Python 2 environment. You control which runtime environment your application uses with a setting in the app’s configuration file, so your application will continue to run as intended when new runtime environments are released.

You can download the App Engine Python SDK bundle for your operating system from the Google App Engine website:

http://code.google.com/appengine/downloads.html

Download and install the file appropriate for your operating system:

• For Windows, the Python SDK is an .msi (Microsoft Installer) file. Click on the appropriate link to download it, then double-click on the file to start the installation process. This installs the Google App Engine Launcher application, adds an icon to your Start menu, and adds the command-line tools to the command path.
• For Mac OS X, the Python SDK is a Mac application in a .dmg (disk image) file. Click on the link to download it, then double-click on the file to mount the disk image. Drag the GoogleAppEngineLauncher icon to your Applications folder. To install the command-line tools, double-click the icon to start the Launcher, then allow the Launcher to create the “symlinks” when prompted.
• If you are using Linux or another platform, the Python SDK is available as a .zip archive. Download and unpack it (typically with the unzip command) to create a directory named google_appengine. The command-line tools all reside in this directory. Adjust your command path as needed.

To test that the App Engine Python SDK is installed, run the following command at a command prompt:

    dev_appserver.py --help

The command prints a helpful message and exits. If instead you see a message about the command not being found, check that the installer completed successfully, and that the location of the dev_appserver.py command is on your command path.

Windows users: if running this command opens a dialog box with the message “Windows cannot open this file... To open this file, Windows needs to know what program created it,” you must tell Windows to use Python to open the file. In the dialog box, choose “Select the program from a list,” and click OK. Click Browse, then locate your Python installation (such as C:\Python25). Select python from this folder, then click Open. Select “Always use the selected program to open this kind of file.” Click OK. A window will open and attempt to run the command, then immediately close. You can now run the command from the Command Prompt.
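The version check described in this section can also be scripted. The following is a minimal sketch, not part of the SDK: the function name check_python_for_sdk and its messages are hypothetical, and it relies only on sys.version_info from the standard library. The target version (2.5) is taken from the text above.

```python
import sys

# Version the SDK in this chapter targets; taken from the surrounding text.
TARGET = (2, 5)

def check_python_for_sdk(version_info=None):
    """Return a short message comparing a Python version against 2.5.

    `version_info` is any tuple-like value such as sys.version_info;
    it defaults to the interpreter running this script.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = (version_info[0], version_info[1])
    if major_minor == TARGET:
        return 'Python %d.%d matches the App Engine runtime.' % major_minor
    if major_minor[0] == 2 and major_minor > TARGET:
        # A newer 2.x release: usable per the text, but not identical.
        return ('Python %d.%d is newer than 2.5; watch for compatibility '
                'issues.' % major_minor)
    return ('Python %d.%d is not the version described here; install '
            'Python 2.5.' % major_minor)

if __name__ == '__main__':
    print(check_python_for_sdk())
```

Passing a tuple such as (2, 6, 0) lets you try the different branches without changing the interpreter you run the script with.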
A brief tour of the Launcher

The Windows and Mac OS X versions of the Python SDK include an application called the Google App Engine Launcher (hereafter just “Launcher”). With the Launcher, you can create and manage multiple App Engine Python projects using a graphical interface. Figure 2-1 shows an example of the Launcher window in Mac OS X.

Figure 2-1. The Google App Engine Launcher for Mac OS X main window, with a project selected

To create a new project, select New Project... from the File menu (or click the plus-sign button at the bottom of the window). Browse to where you want to keep your project files, then enter a name for the project. The Launcher creates a new directory at that location, named after the project, to hold the project’s files, and creates several starter files. The project appears in the project list in the main launcher window.

To start the development web server, make sure the project is selected, then click the Run button. You can stop the server with the Stop button. To open the home page of the running app in a browser, click the Browse button. The Logs button displays messages logged by the app in the development server.

The SDK Console button opens a web interface for the development server with several features for inspecting the running application, including tools to inspect the contents of the (simulated) datastore and memory cache, and an interactive console that executes Python statements and displays the results.

The Edit button opens the project’s files in your default text editor. In the Mac OS X version, this is especially useful with text editors that can open a directory’s worth of files, such as TextMate or Emacs. In the Windows version, this just opens app.yaml for editing.

The Deploy button uploads the project to App Engine.
Before you can deploy a project, you must register an application ID with App Engine and edit the application’s configuration file with the registered ID. The Dashboard button opens a browser window with the App Engine Administration Console for the deployed app. We’ll look at the configuration file, the registration process, and the Administration Console later in this chapter.

The complete App Engine Python SDK, including the command-line tools, resides in the Launcher’s application directory. In the Windows version, the installer adds the appropriate directory to the command path, so you can run these tools from a Command Prompt.

In Mac OS X, when you start the Launcher for the first time it asks for permission to create “symlinks.” This creates symbolic links in the directory /usr/local/bin/ that refer to the command-line tools in the application bundle. With the links in this directory, you can type just the name of a command at a Terminal prompt to run it. If you didn’t create the symlinks, you can do so later by selecting the Make Symlinks... item from the GoogleAppEngineLauncher menu.

You can set command-line flags for the development server within the Launcher. To do so, select the application, then go to the Edit menu and select Application Settings.... Add the desired command-line options to the “Extra Flags” field, then click Update.

The Mac OS X version of the Launcher installs Google’s software update facility to check for new versions of the App Engine SDK. When a new version is released, this feature notifies you and offers to upgrade. Immediately after you upgrade, you’ll notice the symlinks stop working. To fix the symlinks, reopen the Launcher app and follow the prompts. The upgrade can’t do this automatically because it needs your permission to create new symlinks.
Installing the Java SDK

The App Engine SDK for the Java runtime environment runs on any computer that runs the Java SE Development Kit (JDK). The App Engine for Java SDK supports JDK 5 and JDK 6. When running on App Engine, the Java runtime environment uses the Java 6 JVM.

If you don’t already have it, you can download and install the Java 6 JDK for most platforms from Sun’s website. (Mac users, see the next section.)

http://java.sun.com/javase/downloads/index.jsp

You can test whether the Java development kit is installed on your system and check which version it is by running the following command at a command prompt (in Windows, Command Prompt; in Mac OS X, Terminal):

    javac -version

If you have the Java 6 JDK installed, the command will print a version number similar to javac 1.6.0. If you have the Java 5 JDK installed, the command will print a version number similar to javac 1.5.0. The actual output varies depending on which specific version you have.

App Engine Java apps use interfaces and features from Java Enterprise Edition (Java EE). The App Engine SDK includes implementations for the relevant Java EE features. You do not need to install a separate Java EE implementation.

The steps for installing the App Engine SDK for Java depend on whether you wish to use the Google Plugin for the Eclipse IDE. We’ll cover these situations separately.

Java on Mac OS X

If you are using Mac OS X, you already have Java and the JDK installed. How you’d use it depends on the version of the operating system, and whether your computer has a 32-bit processor (such as the Intel Core Duo) or a 64-bit processor (Intel Core 2 Duo, Intel Xeon). You can check which processor you have by selecting the Apple menu, About This Mac.

Mac OS X 10.6 Snow Leopard includes Java 6 and its JDK, and it includes separate versions for 32-bit processors and for 64-bit processors. If you have a 64-bit processor, the 64-bit Java 6 is the default.
If you have a 32-bit processor, the 32-bit Java 6 is the default.

Mac OS X 10.5 Leopard includes both Java 5 and Java 6. However, in Leopard, Java 5 is the default. This is because Leopard’s version of Java 6 only works with 64-bit processors. If you have a 64-bit processor, you can change the default version to the 64-bit Java 6.

To change the version of Java used by the system, open the Java Preferences utility, which you can find under /Applications/Utilities/. In the “Java Applications” list, drag the desired version (such as “Java SE 6, 64-bit”) to the top of the list. OS X uses the topmost version in the list that is compatible with your system.

If you have a 32-bit Mac running Leopard, you’re stuck using Java 5. The App Engine SDK works fine under Java 5, and apps built with Java 5 run fine on App Engine. Just be aware that you’re using Java 5, and code samples you might find for App Engine may assume Java 6.

If you are using Eclipse, make sure you get the version that corresponds with your processor and selected version of Java. Separate versions of the “Eclipse IDE for Java EE Developers” bundle are available for 32-bit and 64-bit processors.

For more information about Java and Mac OS X, see Apple’s developer website:

http://developer.apple.com/java/

Installing the Java SDK with the Google Plugin for Eclipse

One of the easiest ways to develop App Engine applications in Java is to use the Eclipse IDE and the Google Plugin for Eclipse. The plug-in works with Eclipse 3.3 (Europa), Eclipse 3.4 (Ganymede), and Eclipse 3.5 (Galileo). You can get Eclipse for your platform for free at the Eclipse website:

http://www.eclipse.org/

If you’re getting Eclipse specifically for App Engine development, get the “Eclipse IDE for Java EE Developers” bundle. This bundle includes several useful components for developing web applications, including the Eclipse Web Tools Platform (WTP) package.
You can tell Eclipse to use the JDK you have installed in the Preferences window. In Eclipse 3.5, select Preferences (Windows and Linux, in the Window menu; Mac OS X, in the Eclipse menu). In the Java category, select “Installed JREs.” If necessary, add the location of the JDK to the list, and make sure the checkbox is checked.

To install the App Engine Java SDK and the Google Plugin, use the software installation feature of Eclipse. In Eclipse 3.5, select Install New Software... from the Help menu, then type the following URL in the “Work with” field and click the Add... button:

http://dl.google.com/eclipse/plugin/3.5

(This URL does not work in a browser; it only works with the Eclipse software installer.)

In the dialog window that opens, enter “Google” for the name, then click OK. Two items are added to the list, one for the plug-in (“Plugin”) and a set for the App Engine and Google Web Toolkit SDKs (“SDKs”). Figure 2-2 shows the Install Software window with the appropriate items selected.

Figure 2-2. The Eclipse 3.5 (Galileo) Install Software window, with the Google Plugin selected

Check the boxes for these two items. Click the Next > button and follow the prompts.

For more information on installing the Google Plugin for Eclipse, including instructions for Eclipse 3.3 or 3.4, see the website for the plug-in:

http://code.google.com/eclipse/

After installation, the Eclipse toolbar has three new buttons, as shown in Figure 2-3.

Figure 2-3. The Eclipse 3.5 toolbar with the Google Plugin installed, with three new buttons: New Web Application Project, GWT Compile Project, and Deploy App Engine Project

The plug-in adds several features to the Eclipse interface:

• The three buttons in the toolbar: New Web Application Project, GWT Compile Project, and Deploy App Engine Project
• A Web Application Project item under New in the File menu
• A Web Application debug profile, for running an app in the development web server under the Eclipse debugger

You can use Eclipse to develop your application, and to deploy it to App Engine. To use other features of the SDK, like downloading log data, you must use the command-line tools from the App Engine SDK. Eclipse installs the SDK in your Eclipse application directory, under eclipse/plugins/. The actual directory name depends on the specific version of the SDK installed, but it looks something like this:

    com.google.appengine.eclipse.sdkbundle_1.2.5.v200909021031/appengine-java-sdk-1.2.5/

This directory contains command-line tools in a subdirectory named bin/. In Mac OS X or Linux, you may need to change the permissions of these files to be executable in order to use the tools from the command line:

    chmod 755 bin/*

You can add the bin/ directory to your command path, but keep in mind that the path will change each time you update the SDK.

Installing the Java SDK without Eclipse

If you are not using the Eclipse IDE or otherwise don’t wish to use the Google Plugin, you can download the App Engine Java SDK as a .zip archive from the App Engine website:

http://code.google.com/appengine/downloads.html

The archive unpacks to a directory with a name such as appengine-java-sdk-1.2.5.

The SDK contains command-line launch scripts in the bin/ subdirectory. You can add this directory to your command path to make the commands easier to run.
Both the AppCfg tool and the development web server execute Java classes to perform their functions. You can integrate these tools into your IDE or build scripts by calling the launch scripts, or by calling the Java classes directly. Look at the contents of the launch scripts to see the syntax.

The App Engine SDK includes a plug-in for Apache Ant that lets you perform functions of the SDK from an Ant build script. See the App Engine documentation for more information about using Ant with App Engine.

Test that the App Engine Java SDK is installed properly by running the following command at a command prompt:

    dev_appserver --help

Mac OS X and Linux users, use dev_appserver.sh as the command name.

The command prints a helpful message and exits. If instead you see a message about the command not being found, check that the archive unpacked successfully, and that the SDK’s bin/ directory is on your command path.

Developing the Application

An App Engine application responds to web requests. It does so by calling request handlers, routines that accept request parameters and return responses. App Engine determines which request handler to use for a given request from the request’s URL, using a configuration file included with the app that maps URLs to handlers.

An app can also include static files, such as images, CSS stylesheets, and browser JavaScript. App Engine serves these files directly to clients in response to requests for corresponding URLs without invoking any code. The app’s configuration specifies which of its files are static, and which URLs to use for those files.

The application configuration includes metadata about the app, such as its application ID and version number. When you deploy the app to App Engine, all of the app’s files, including the code, configuration files, and static files, are uploaded and associated with the application ID and version number mentioned in the configuration.
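The URL-to-handler mapping described in this section can be pictured with a short sketch. This is plain Python using only the re module, not App Engine’s actual dispatcher: the route table, the two handler functions, and the dispatch helper are all hypothetical names chosen for illustration, and the first-match-wins ordering is an assumption of the sketch.

```python
import re

def show_clock(params):
    # Hypothetical dynamic handler: builds a response from request parameters.
    return 'clock page for tz=%s' % params.get('tz', 'UTC')

def save_prefs(params):
    # Hypothetical second handler, standing in for a preferences form.
    return 'saved tz=%s' % params.get('tz', 'UTC')

# Ordered list of (URL pattern, handler). The first pattern that matches
# the whole path wins, so specific patterns go before the catch-all.
ROUTES = [
    (r'/prefs', save_prefs),
    (r'/.*', show_clock),
]

def dispatch(path, params):
    """Return the response from the first handler whose pattern matches."""
    for pattern, handler in ROUTES:
        if re.match(pattern + r'$', path):
            return handler(params)
    return '404 Not Found'
```

For instance, dispatch('/prefs', {'tz': 'PST'}) reaches save_prefs, while any other path beginning with a slash falls through to the catch-all pattern and show_clock.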
An app can also have configuration files specific to the services, such as for datastore indexes, task queues, and scheduled tasks. These files are associated with the app in general, not a specific version of the app.

The structure and format of the code and configuration files differ for Python apps and for Java apps, but the concepts are similar. In the next few sections, we will create the files needed for a simple application in both Python and Java, and will look at how to use the tools and libraries included with each SDK.

The User Preferences Pattern

The application we will create in this section is a simple clock. When a user visits the site, the app displays the current time of day according to the server’s system clock. By default, the app shows the current time in the Coordinated Universal Time (UTC) time zone. The user can customize the time zone by signing in using Google Accounts and setting a preference.

This app demonstrates three App Engine features:

• The datastore, primary storage for user settings data that is persistent, reliable, and scalable
• The memory cache (or memcache), secondary storage that is faster than the datastore, but is not necessarily persistent in the long term
• Google Accounts, the ability to use Google’s user account system for authenticating and identifying users

Google Accounts works similarly to most user account systems. If the user is not signed in to the clock application, she sees a generic view with default settings (the UTC time zone) and a link to sign in or create a new account. If the user chooses to sign in or register, the application directs her to a sign-in form managed by Google Accounts. Signing in or creating an account redirects the user back to the application.

Of course, you can implement your own account mechanism instead of using Google Accounts.
Using Google Accounts has advantages and disadvantages. The chief advantage is that you don’t have to implement your own account mechanism. If a user of your app already has a Google account, the user can sign in with that account without creating a new account for your app.

If the user accesses the application while signed in, the app loads the user’s preferences data and uses it to render the page. The app retrieves the preferences data in two steps. First, it attempts to get the data from the fast secondary storage, the memory cache. If the data is not present in the memory cache, the app attempts to retrieve it from the primary storage (the datastore), and if successful, it puts it into the memory cache to be found by future requests.

This means that for most requests, the application can get the user’s preferences from the memcache without accessing the datastore. While reading from the datastore is reasonably fast, reading from the memcache is much faster. The difference is substantial when the same data must be accessed every time the user visits a page.

Our clock application has two request handlers. One handler displays the current time of day, along with links for signing in and out. It also displays a web form for adjusting the time zone when the user is signed in. The second request handler processes the time zone form when it is submitted. When the user submits the preferences form, the app saves the changes and redirects the browser back to the main page.

The application gets the current time from the application server’s system clock. It’s worth noting that App Engine makes no guarantees that the system clocks of all of its web servers are synchronized. Since two requests for this app may be handled by different servers, different requests may see different clocks. The server clock is not consistent enough as a source of time data for a real-world application, but it’s good enough for this example.
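The two-step preferences lookup described above can be sketched in a few lines. The sketch models the memory cache and the datastore with two plain dictionaries, so it does not use the App Engine APIs; FAKE_MEMCACHE, FAKE_DATASTORE, get_prefs, and set_prefs are hypothetical names, and refreshing the cache on every save is an assumption of this sketch rather than something the text prescribes.

```python
# Stand-ins for the two storage layers. A real app would call the
# memcache and datastore services instead of using dictionaries.
FAKE_MEMCACHE = {}
FAKE_DATASTORE = {'alice': {'tz_offset': -8}}

DEFAULT_PREFS = {'tz_offset': 0}  # UTC, the default described in the text

def get_prefs(user_id):
    """Return (prefs, source): try the cache first, then the datastore."""
    prefs = FAKE_MEMCACHE.get(user_id)
    if prefs is not None:
        return prefs, 'memcache'
    prefs = FAKE_DATASTORE.get(user_id)
    if prefs is not None:
        # Populate the cache so later requests can skip the datastore.
        FAKE_MEMCACHE[user_id] = prefs
        return prefs, 'datastore'
    return DEFAULT_PREFS, 'default'

def set_prefs(user_id, prefs):
    """Save to primary storage, then refresh the cached copy (assumed)."""
    FAKE_DATASTORE[user_id] = prefs
    FAKE_MEMCACHE[user_id] = prefs
```

The first call to get_prefs('alice') reads from the simulated datastore and fills the cache; a second call for the same user is served from the cache, which is the behavior the text relies on for most requests.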
In the next section, we implement this app using Python. We do the same thing with Java in the section “Developing a Java App.” As before, feel free to skip the section that doesn’t apply to you.

Developing a Python App

The simplest Python application for App Engine is a single directory with two files: a configuration file named app.yaml, and a file of Python code for a request handler. The directory containing the app.yaml file is the application root directory. You’ll refer to this directory often when using the tools.

If you are using the Launcher, you can start a new project by selecting the File menu, New Application.... The Launcher creates a new project with several files, which you may wish to edit to follow along with the example. Alternatively, you can create the project directory and files by hand, then add the project to the Launcher by selecting the File menu, Add Existing Application....

Create a directory named clock to contain the project. Using your favorite text editor, create a file inside this directory named app.yaml similar to Example 2-1.

Example 2-1. The app.yaml configuration file for a simple application

    application: clock
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /.*
      script: main.py

This configuration file is in a format called YAML, an open format for configuration files and network messages. You don’t need to know much about the format beyond what you see here.

In this example, the configuration file tells App Engine that this is version 1 of an application called clock, which uses version 1 of the Python runtime environment (the “API version”). Every request for this application (every URL that matches the regular expression /.*) is to be handled by a Python script named main.py.

Create a file named main.py similar to Example 2-2, in the same directory as app.yaml.

Example 2-2. A simple Python request handler script

    import datetime

    # HTTP header, then a blank line, then the response body.
    print 'Content-Type: text/html'
    print ''
    print '<p>The time is: %s</p>' % str(datetime.datetime.now())

This simple Python program imports the datetime module from the Python standard library, prints an HTTP header that indicates the type of the document (HTML), then prints a message containing the current time according to the web server’s clock.

Python request handler scripts use the CGI protocol for communicating with App Engine. When App Engine receives a request for your Python application, App Engine establishes a runtime environment with the request data in environment variables, determines which handler script to run using the configuration file, then runs the script with the request body (if any) on the standard input stream. The handler script is expected to perform all necessary actions for the request, then print a response to the standard output stream, including a valid HTTP header. This simple example ignores the request data, prints a header indicating the type of the response data, then prints a message with the current time to be displayed in the browser.

Let’s test what we have so far. Start the development server by running the dev_appserver.py command, specifying the path to the project directory (clock) as an argument:

    dev_appserver.py clock

If your current working directory is the clock directory you just created, you can run the command using a dot (.) as the path to the project:

    dev_appserver.py .

The server starts up and prints several messages to the console. If this is the first time you’re running the server from the command line, it may ask whether you want it to check for updates; type your answer, then hit Enter. You can safely ignore warnings that say “Could not read datastore data” and “Could not initialize images API.” These are expected if you have followed the installation steps so far.

The last message should look something like this:

    INFO ... Running application clock on port 8080: http://localhost:8080

This message indicates the server started successfully. If you do not see this message, check the other messages for hints, and double-check that the syntax of your app.yaml file is correct.

Test your application by visiting the server’s URL in a web browser:

http://localhost:8080/

The browser displays a page similar to Figure 2-4.

Figure 2-4. The first version of the clock application viewed in a browser

You can leave the web server running while you develop your application. The web server notices when you make changes to your files, and reloads them automatically as needed.

Using the Launcher, you can start the development web server by clicking the Run button. The icon next to the project turns green when the server starts successfully. To open a browser to view the project, click the Browse button.

Introducing the webapp framework

The code in Example 2-2 attempts to implement the CGI protocol directly, but you’d never do this for a real application. The Python standard library includes modules, such as the aptly named cgi module, that implement the CGI protocol and perform other common web application tasks. These implementations are complete, fast, and thoroughly tested, and it’s nearly always better to use modules like these than to implement your own from scratch.

Web application frameworks go beyond libraries of modules to implement the best practices of web application development as a coherent suite of tools, components, and patterns: data modeling interfaces, template systems, request handling mechanisms, project management tools, and development environments that work together to reduce the amount of code you have to write and maintain. There are dozens of web frameworks written in Python, and several are mature, well documented, and have active developer communities.
Django, web2py, and Pylons are examples of well-established Python web frameworks.

Not every Python web application framework works completely with the App Engine Python runtime environment. Constraints imposed by App Engine’s sandboxing logic, especially the restriction on modules that use compiled C code, limit which frameworks work out of the box. Django (http://www.djangoproject.com/) is known to work well, and others have been adapted with additional software. We’ll discuss how to use Django with App Engine in Chapter 14.

To make it easy to get started, App Engine includes a simple web framework called “webapp.” The webapp framework is intended to be small and easy to use. It doesn’t have the features of more established frameworks, but it’s good enough for small projects. If you created the “clock” project using the Launcher, you may have noticed that the starter files use the webapp framework.

For simplicity, most of the Python examples in this book use the webapp framework. We won’t cover webapp in detail in this book, but we’ll introduce some of its features here. For larger applications, you may want to use a more full-featured framework such as Django.

Let’s upgrade the clock app to use the webapp framework. Replace the contents of main.py with the version shown in Example 2-3. Reload the page in your browser to see the new version in action. (You won’t notice a difference other than an updated time. This example is equivalent to the previous version.)

Example 2-3. A simple request handler using the webapp framework

    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app
    import datetime

    class MainPage(webapp.RequestHandler):
        def get(self):
            time = datetime.datetime.now()
            self.response.headers['Content-Type'] = 'text/html'
            self.response.out.write('<p>The time is: %s</p>' % str(time))

    application = webapp.WSGIApplication([('/', MainPage)],
                                         debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()

Example 2-3 imports the module google.appengine.ext.webapp, then defines a request handler class called MainPage, a subclass of webapp.RequestHandler. The class defines methods for each HTTP method supported by the handler, in this case one method for HTTP GET called get(). When the application handles a request, it instantiates the handler class, sets self.request and self.response to values the handler method can access and modify, then calls the appropriate handler method, in this case get(). When the handler method exits, the application uses the value of self.response as the HTTP response.

The application itself is represented by an instance of the class webapp.WSGIApplication. The instance is initialized with a list of mappings of URLs to handler classes. The debug parameter tells the application to print error messages to the browser window when a handler raises an exception, provided the application is running under the development web server. webapp detects whether it is running under the development server or running as a live App Engine application, and will not print errors to the browser when running live even if debug is True. You can set it to False to have the development server emulate the live server when errors occur.

The script defines a main() function that runs the application using a utility method. Lastly, the script calls main() using the Python idiom of if __name__ == '__main__': ..., a condition that is always true when the script is run by the web server. This idiom allows you to import the script as a module for other code, including the classes and functions defined in the script, without running the main() routine.
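The request sequence described above (match a URL, instantiate the handler class, call the method named after the HTTP verb, then read back the response) can be sketched in plain Python without App Engine. This is only an illustration of the idea, not webapp's implementation: Response, RequestHandler, and dispatch are hypothetical stand-ins, and the first-match, whole-path regular expression rule is an assumption made for the sketch.

```python
import datetime
import re


class Response(object):
    """Hypothetical stand-in for the response object a framework provides."""

    def __init__(self):
        self.headers = {}
        self.body = []

    def write(self, text):
        self.body.append(text)


class RequestHandler(object):
    """Hypothetical base class: subclasses add one method per HTTP method."""

    def __init__(self, path):
        self.path = path
        self.response = Response()


class MainPage(RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/html'
        self.response.write('<p>The time is: %s</p>' % datetime.datetime.now())


def dispatch(routes, method, path):
    """Use the first route whose pattern matches the whole path."""
    for pattern, handler_class in routes:
        if re.match('^' + pattern + '$', path):
            handler = handler_class(path)      # a fresh instance per request
            handler_method = getattr(handler, method.lower(), None)
            if handler_method is None:
                return 405, None               # handler lacks this HTTP method
            handler_method()
            return 200, handler.response
    return 404, None                           # no pattern matched


routes = [('/', MainPage)]
status, response = dispatch(routes, 'GET', '/')
```

Because MainPage defines only get(), a POST to the same path falls through to the 405 branch, which mirrors the idea that a handler class supports exactly the HTTP methods it defines.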
Defining a main() function this way allows App Engine to cache the compiled handler script, making subsequent requests faster to execute. For more information on app caching, see Chapter 3.

A single WSGIApplication instance can handle multiple URLs, routing the request to different RequestHandler classes based on the URL pattern. But we’ve already seen that the app.yaml file maps URL patterns to handler scripts. So which URL patterns should appear in app.yaml, and which should appear in the WSGIApplication? Many web frameworks include their own URL dispatcher logic, and it’s common to route all dynamic URLs to the framework’s dispatcher in app.yaml. With webapp, the answer mostly depends on how you’d like to organize your code. For the clock application, we will create a second request handler as a separate script to take advantage of a feature of app.yaml for user authentication, but we could also put this logic in main.py and route the URL with the WSGIApplication object.

Users and Google Accounts

So far, our clock shows the same display for every user. To allow each user to customize the display and save her preferences for future sessions, we need a way to identify the user making a request. An easy way to do this is with Google Accounts.

Let’s add something to the page that indicates whether the user is signed in, and provides links for signing in and signing out of the application. Edit main.py to resemble Example 2-4.

Example 2-4. A version of main.py that displays Google Accounts information and links

    from google.appengine.api import users
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app
    import datetime

    class MainPage(webapp.RequestHandler):
        def get(self):
            time = datetime.datetime.now()
            user = users.get_current_user()
            if not user:
                navbar = ('<p>Welcome! <a href="%s">Sign in or register</a> to customize.</p>'
                          % (users.create_login_url(self.request.path)))
            else:
                navbar = ('<p>Welcome, %s! You can <a href="%s">sign out</a>.</p>'
                          % (user.email(), users.create_logout_url(self.request.path)))

            self.response.headers['Content-Type'] = 'text/html'
            self.response.out.write('''
            <html>
              <head>
                <title>The Time Is...</title>
              </head>
              <body>
              %s
              <p>The time is: %s</p>
              </body>
            </html>
            ''' % (navbar, str(time)))

    application = webapp.WSGIApplication([('/', MainPage)],
                                         debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()

In a real application, you would use a templating system for the output, separating the HTML and display logic from the application code. While many web application frameworks include a templating system, webapp does not. Since the clock app only has one page, we’ll put the HTML in the handler code, using Python string formatting to keep things organized.

The Python runtime environment includes a version of Django, whose templating system can be used with webapp. When Google released version 1 of the Python runtime environment, the latest version of Django was 0.96, so this is what the runtime includes. For more information on using Django templates with webapp, see the App Engine documentation.

Reload the page in your browser. The new page resembles Figure 2-5.

Figure 2-5. The clock app with a link to Google Accounts when the user is not signed in

The Python API for Google Accounts is provided by the module google.appengine.api.users. The get_current_user() function in this module returns None if the user is not signed in, or an object of the class User with the user’s account information. The email() method on the User object returns the user’s email address.

The create_login_url() and create_logout_url() methods generate URLs that go to Google Accounts. Each of these methods takes a URL path for the app where the user should be redirected after performing the desired task. The login URL goes to the Google Accounts page where the user can sign in or register for a new account. The logout URL visits Google Accounts to sign out the current user, then immediately redirects back to the given application URL.
If you click on the “Sign in or register” link with the app running in the development server, the link goes to the development server’s simulated version of the Google Accounts sign-in screen, as shown in Figure 2-6. At this screen, you can enter any email address, and the development server will proceed as if you are signed in with an account that has that address.

Figure 2-6. The development server’s simulated Google Accounts sign-in screen

If this app were running on App Engine, the login and logout URLs would go to the actual Google Accounts locations. Once signed in or out, Google Accounts redirects back to the given URL path for the live application.

Click on “Sign in or register,” then click on the Login button on the simulated Google Accounts screen, using the default test email address (test@example.com). The clock app now looks like Figure 2-7. To sign out again, click the “sign out” link.

Figure 2-7. The clock app, with the user signed in

Web forms and the datastore

Now that we know who the user is, we can ask her for her preferred time zone, remember her preference, and use it on future visits.

First, we need a way to remember the user’s preferences so future requests can access them. The App Engine datastore provides reliable, scalable storage for this purpose. The Python API includes a data modeling interface that maps Python objects to datastore entities. We can use it to write a UserPrefs class. Create a file named models.py, as shown in Example 2-5.

Example 2-5. The file models.py, with a class for storing user preferences in the datastore

    from google.appengine.api import users
    from google.appengine.ext import db

    class UserPrefs(db.Model):
        tz_offset = db.IntegerProperty(default=0)
        user = db.UserProperty(auto_current_user_add=True)

    def get_userprefs(user_id=None):
        if not user_id:
            user = users.get_current_user()
            if not user:
                return None
            user_id = user.user_id()

        key = db.Key.from_path('UserPrefs', user_id)
        userprefs = db.get(key)
        if not userprefs:
            userprefs = UserPrefs(key_name=user_id)
        return userprefs

The Python data modeling interface is provided by the module google.appengine.ext.db. A data model is a class whose base class is db.Model. The model subclass defines the structure of the data in each object using class properties. This structure is enforced by db.Model when values are assigned to instance properties. For our UserPrefs class, we define two properties: tz_offset, an integer, and user, a User object returned by the Google Accounts API.

Every datastore entity has a primary key. Unlike a primary key in a relational database table, an entity key is permanent and can only be set when the entity is created. A key is unique across all entities in the system, and consists of several parts, including the entity’s kind (in this case 'UserPrefs'). An app can set one component of the key to an arbitrary value, known in the API as the key name.

The clock application uses the user’s unique ID, provided by the user_id() method of the User object, as the key name of a UserPrefs entity. This allows the app to fetch the entity by key, since it knows the user’s ID from the Google Accounts API. Fetching the entity by key is faster than performing a datastore query.

In models.py, we define a function named get_userprefs() that gets the UserPrefs object for the user.
After determining the user ID, the function constructs a datastore key for an entity of the kind 'UserPrefs' with a key name equivalent to the user ID. If the entity exists in the datastore, the function returns the UserPrefs object.

If the entity does not exist in the datastore, the function creates a new UserPrefs object with default settings and a key name that corresponds to the user. The new object is not saved to the datastore automatically. The caller must invoke the put() method on the UserPrefs instance to save it.

Now that we have a mechanism for getting a UserPrefs object, we can make two upgrades to the main page. If the user is signed in, we can get the user’s preferences (if any) and adjust the clock’s time zone. Let’s also display a web form so the user can set a time zone preference. Edit main.py to resemble Example 2-6 to implement these features.

Example 2-6. A new version of main.py that adjusts the clock to the user’s time zone and displays a preferences form

    from google.appengine.api import users
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app
    import datetime
    import models

    class MainPage(webapp.RequestHandler):
        def get(self):
            time = datetime.datetime.now()
            user = users.get_current_user()
            if not user:
                navbar = ('<p>Welcome! <a href="%s">Sign in or register</a> to customize.</p>'
                          % (users.create_login_url(self.request.path)))
                tz_form = ''
            else:
                userprefs = models.get_userprefs()
                navbar = ('<p>Welcome, %s! You can <a href="%s">sign out</a>.</p>'
                          % (user.nickname(), users.create_logout_url(self.request.path)))
                # Form markup reconstructed from the handler description that
                # follows: it POSTs a tz_offset field to /prefs.
                tz_form = '''
                <form action="/prefs" method="post">
                  <label for="tz_offset">Time zone offset from UTC:</label>
                  <input name="tz_offset" id="tz_offset" type="text" size="4" value="%d" />
                  <input type="submit" value="Set" />
                </form>
                ''' % userprefs.tz_offset
                # Shift the displayed time by the user's offset, in hours.
                time += datetime.timedelta(hours=userprefs.tz_offset)

            self.response.headers['Content-Type'] = 'text/html'
            self.response.out.write('''
            <html>
              <head>
                <title>The Time Is...</title>
              </head>
              <body>
              %s
              <p>The time is: %s</p>
              %s
              </body>
            </html>
            ''' % (navbar, str(time), tz_form))

    application = webapp.WSGIApplication([('/', MainPage)],
                                         debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()

To enable the preferences form, we need a request handler to parse the form data and update the datastore. Let’s implement this as a new request handler script. Create a file named prefs.py with the contents shown in Example 2-7.

Example 2-7. A new handler script, prefs.py, for the preferences form

    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app
    import models

    class PrefsPage(webapp.RequestHandler):
        def post(self):
            userprefs = models.get_userprefs()
            try:
                tz_offset = int(self.request.get('tz_offset'))
                userprefs.tz_offset = tz_offset
                userprefs.put()
            except ValueError:
                # User entered a value that wasn't an integer. Ignore for now.
                pass

            self.redirect('/')

    application = webapp.WSGIApplication([('/prefs', PrefsPage)],
                                         debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()

This request handler handles HTTP POST requests to the URL /prefs, which is the URL (“action”) and HTTP method used by the form. The handler calls the get_userprefs() function from models.py to get the UserPrefs object for the current user, which is either a new unsaved object with default values, or the object for an existing entity. The handler parses the tz_offset parameter from the form data as an integer, sets the property of the UserPrefs object, then saves the object to the datastore by calling its put() method. The put() method creates the object if it doesn’t exist, or updates the existing object.

If the user enters a noninteger in the form field, we don’t do anything. It’d be appropriate to return an error message, but we’ll leave this as is to keep the example simple.
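As a rough illustration of what returning an error message could involve, the sketch below separates parsing the form value from applying it to the clock, using only the standard library. The function names parse_tz_offset and apply_tz_offset, the error strings, and the -12 to +14 hour range check are assumptions made for this example; none of them come from the App Engine API or from the clock application itself.

```python
import datetime


def parse_tz_offset(raw):
    """Return (offset, error); exactly one of the two is None."""
    try:
        offset = int(raw)
    except (TypeError, ValueError):
        return None, 'Time zone offset must be a whole number of hours.'
    # Assumed sanity check: whole-hour UTC offsets run from -12 to +14.
    if not -12 <= offset <= 14:
        return None, 'Time zone offset must be between -12 and 14.'
    return offset, None


def apply_tz_offset(utc_time, offset_hours):
    """Shift a datetime by a whole number of hours."""
    return utc_time + datetime.timedelta(hours=offset_hours)


offset, error = parse_tz_offset('-8')
local = apply_tz_offset(datetime.datetime(2009, 11, 1, 12, 0, 0), offset)
bad_offset, bad_error = parse_tz_offset('PST')
```

A handler could call parse_tz_offset() on the submitted value and, when an error string comes back, show it next to the form instead of redirecting silently.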
Finally, edit app.yaml to map the handler script to the URL /prefs in the handlers: section, as shown in Example 2-8.

Example 2-8. A new version of app.yaml mapping the URL /prefs, with login required

    application: clock
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /prefs
      script: prefs.py
      login: required

    - url: /.*
      script: main.py

The login: required line says that the user must be signed in to Google Accounts to access the /prefs URL. If the user accesses the URL while not signed in, App Engine automatically directs the user to the Google Accounts sign-in page, then redirects her back to this URL afterward. This makes it easy to require sign-in for sections of your site, and to ensure that the user is signed in before the request handler is called.

Be sure to put the /prefs URL mapping before the /.* mapping. URL patterns are tried in order, and the first pattern to match determines the handler used for the request. Since the pattern /.* matches all URLs, /prefs must come first or it will be ignored.

Reload the page to see the customizable clock in action. Try changing the time zone by submitting the form. Also try signing out, then signing in again using the same email address, and again with a different email address. The app remembers the time zone preference for each user.

Caching with memcache

The code that gets user preferences data in Example 2-5 fetches an entity from the datastore every time a signed-in user visits the site. User preferences are often read and seldom changed, so getting a UserPrefs object from the datastore with every request is more expensive than it needs to be. We can mitigate the costs of reading from primary storage using a caching layer.

We can use the memcache service as secondary storage for user preferences data. We can add caching with just a few changes to models.py. Edit this file as shown in Example 2-9.

Example 2-9. A new version of models.py that caches UserPrefs objects in memcache

    from google.appengine.api import memcache
    from google.appengine.api import users
    from google.appengine.ext import db

    class UserPrefs(db.Model):
        tz_offset = db.IntegerProperty(default=0)
        user = db.UserProperty(auto_current_user_add=True)

        def cache_set(self):
            memcache.set(self.key().name(), self, namespace=self.key().kind())

        def put(self):
            self.cache_set()
            db.Model.put(self)

    def get_userprefs(user_id=None):
        if not user_id:
            user = users.get_current_user()
            if not user:
                return None
            user_id = user.user_id()

        userprefs = memcache.get(user_id, namespace='UserPrefs')
        if not userprefs:
            key = db.Key.from_path('UserPrefs', user_id)
            userprefs = db.get(key)
            if userprefs:
                userprefs.cache_set()
            else:
                userprefs = UserPrefs(key_name=user_id)

        return userprefs

The Python API for the memcache service is provided by the module google.appengine.api.memcache. The memcache stores key-value pairs, with an optional namespace for the key. The value can be of any type that can be converted to and from a flat data representation (serialized) using the Python pickle module, including most data objects.

The new version of the UserPrefs class overrides the put() method. When the put() method is called on an instance, the instance is saved to the memcache, then it is saved to the datastore using the original put() method. (db.Model.put(self) is one way to call the overridden superclass method in Python.)

A new UserPrefs method called cache_set() makes the call to memcache.set(). memcache.set() takes a key, a value, and an optional namespace for the key. Here, we use the entity’s key name as the key, the full object (self) as the value, and the entity’s kind ('UserPrefs') as the namespace. The API takes care of serializing the UserPrefs object, so we can put in and take out fully formed objects.
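The caching pattern in Example 2-9 can be tried outside App Engine with a small stand-in. The sketch below is an assumption-laden illustration, not the memcache API: FakeCache is a hypothetical in-process class that only mimics the set() and get() calls with a namespace, a plain dict plays the role of the datastore, and preferences are plain dicts rather than db.Model instances.

```python
import pickle


class FakeCache(object):
    """Hypothetical in-process stand-in for a memcache-like service."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, namespace=None):
        # Store a serialized copy, as a separate cache service would.
        self._data[(namespace, key)] = pickle.dumps(value)

    def get(self, key, namespace=None):
        blob = self._data.get((namespace, key))
        return None if blob is None else pickle.loads(blob)


cache = FakeCache()
datastore = {}         # stands in for primary storage
datastore_reads = []   # records every fall-through to primary storage


def save_prefs(prefs):
    """Write to the cache, then to primary storage."""
    cache.set(prefs['user_id'], prefs, namespace='UserPrefs')
    datastore[prefs['user_id']] = prefs


def get_prefs(user_id):
    """Check the cache first; fall back to storage, then to defaults."""
    prefs = cache.get(user_id, namespace='UserPrefs')
    if prefs is None:
        datastore_reads.append(user_id)
        prefs = datastore.get(user_id)
        if prefs is not None:
            cache.set(user_id, prefs, namespace='UserPrefs')
        else:
            prefs = {'user_id': user_id, 'tz_offset': 0}   # unsaved defaults
    return prefs


datastore['alice'] = {'user_id': 'alice', 'tz_offset': -8}
first = get_prefs('alice')    # cache miss: read from storage, then cached
second = get_prefs('alice')   # cache hit: storage is not consulted
```

Because the cache holds pickled copies, changing a returned object does not change the cached value; the new value reaches the cache only through another set() call, which is why the put() override calls cache_set().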
The new version of get_userprefs() checks the memcache for the UserPrefs object before going to the datastore. If it finds it in the cache, it uses it. If it doesn’t, it checks the datastore, and if it finds it there, it stores it in the cache and uses it. If the object is in neither the memcache nor the datastore, get_userprefs() returns a fresh UserPrefs object with default values.

Reload the page to see the new version work. To make the caching behavior more visible, you can add logging statements in the appropriate places, like so:

    import logging

    class UserPrefs(db.Model):
        # ...
        def cache_set(self):
            logging.info('cache set')
            # ...

The development server prints logging output to the console. If you are using the Launcher, you can open a window of development server output by clicking the Logs button.

Next, we’ll take a look at the same example using the Java runtime environment. If you’re not interested in Java, you can skip ahead to “Registering the Application” on page 55.

Developing a Java App

Java web applications for App Engine use the Java Servlet standard interface for interacting with the application server. An application consists of one or more classes that extend a servlet base class. Servlets are mapped to URLs using a standard configuration file called a “deployment descriptor.” When App Engine receives a request for a Java application, it determines which servlet class to use based on the URL and the deployment descriptor, instantiates the class, then calls an appropriate method on the servlet object.

All of the files for a Java application, including the compiled Java classes, configuration files, and static files, are organized in a standard directory structure called a Web Application Archive, or “WAR.” Everything in the WAR directory gets deployed to App Engine.
It’s common to have your development workflow build the contents of the WAR from a set of source files, either using an automated build process or WAR-aware development tools.

If you are using the Eclipse IDE with the Google Plugin, you can create a new project using the Web Application wizard. From the File menu, select New, then Web Application Project. In the window that opens, enter a project name (such as Clock) and package name (such as clock). Uncheck the “Use Google Web Toolkit” checkbox, and make sure the “Use Google App Engine” checkbox is checked. (If you leave the GWT checkbox checked, the new project will be created with GWT starter files.) Click Finish to create the project.

If you are not using the Google Plugin for Eclipse, you will need to create the directories and files another way. If you are already familiar with Java web development, you can use your existing tools and processes to produce the final WAR. For the rest of this section, we’ll assume the directory structure created by the Eclipse plug-in.

Figure 2-8 shows the project file structure, as depicted in the Eclipse Package Explorer.

The project root directory (Clock) contains two major subdirectories: src and war. The src/ directory contains the source files for all of the project’s classes in the usual Java package structure. With a package path of clock, Eclipse created source code for a servlet class named ClockServlet in the file clock/ClockServlet.java.

The war/ directory contains the complete final contents of the application. Eclipse compiles source code from src/ automatically and puts the compiled class files in war/WEB-INF/classes/, which is hidden from Eclipse’s Package Explorer by default. Eclipse copies the contents of src/META-INF/ to war/WEB-INF/classes/META-INF/ automatically, as well. Everything else must be created in the war/ directory in its intended location.

Let’s start our clock application with a simple servlet that displays the current time.
Open the file src/clock/ClockServlet.java for editing (creating it if necessary), and give it contents similar to Example 2-10.

Example 2-10. A simple Java servlet

    package clock;

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.SimpleTimeZone;
    import javax.servlet.http.*;

    @SuppressWarnings("serial")
    public class ClockServlet extends HttpServlet {
        public void doGet(HttpServletRequest req,
                          HttpServletResponse resp)
                throws IOException {
            // HH is the 24-hour clock; hh would print 12-hour time with no AM/PM marker.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS");
            fmt.setTimeZone(new SimpleTimeZone(0, ""));

            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println("<p>The time is: " + fmt.format(new Date()) + "</p>");
        }
    }

The servlet class extends javax.servlet.http.HttpServlet, and overrides methods for each of the HTTP methods it intends to support. This servlet overrides the doGet() method to handle HTTP GET requests. The server calls the method with an HttpServletRequest object and an HttpServletResponse object as parameters. The HttpServletRequest contains information about the request, such as the URL, form parameters, and cookies. The method prepares the response using methods on the HttpServletResponse, such as setContentType() and getWriter(). App Engine sends the response when the servlet method exits.

Figure 2-8. A new Java project structure, as shown in the Eclipse Package Explorer

To tell App Engine to invoke this servlet for requests, we need a deployment descriptor. Open or create the file war/WEB-INF/web.xml, and give it contents similar to Example 2-11.

Example 2-11. The web.xml file, also known as the deployment descriptor, mapping all URLs to ClockServlet

    <?xml version="1.0" encoding="utf-8"?>
    <web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
      <servlet>
        <servlet-name>clock</servlet-name>
        <servlet-class>clock.ClockServlet</servlet-class>
      </servlet>
      <servlet-mapping>
        <servlet-name>clock</servlet-name>
        <url-pattern>/*</url-pattern>
      </servlet-mapping>
    </web-app>

Eclipse may open this file in its XML “Design” view, a table-like view of the elements and values. Select the “Source” tab at the bottom of the editor pane to edit the XML source.

web.xml is an XML file with a root element of <web-app>. To map URL patterns to servlets, you declare each servlet with a <servlet> element, then declare the mapping with a <servlet-mapping> element. The <url-pattern> of a servlet mapping can be a full URL path, or a URL path with a * at the beginning or end to represent a part of a path. In this case, the URL pattern /* matches all URLs.

Be sure that each of your <url-pattern> values starts with a forward slash (/). Omitting the starting slash may have the intended behavior on the development web server but unintended behavior on App Engine.

App Engine needs one additional configuration file that isn’t part of the servlet standard. Open or create the file war/WEB-INF/appengine-web.xml, and give it contents similar to Example 2-12.
Example 2-12. The appengine-web.xml file, with App Engine-specific configuration for the Java app

    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <application>clock</application>
      <version>1</version>
    </appengine-web-app>

In this example, the configuration file tells App Engine that this is version 1 of an application called clock. You can also use this configuration file to control other behaviors, such as static files and sessions. For more information, see Chapter 3.

The WAR for the application must include several JARs from the App Engine SDK: the Java EE implementation JARs, and the App Engine API JAR. The Eclipse plug-in installs these JARs in the WAR automatically. If you are not using the Eclipse plug-in, you must copy these JARs manually. Look in the SDK directory in the lib/user/ and lib/shared/ subdirectories. Copy every .jar file from these directories to the war/WEB-INF/lib/ directory in your project.

Finally, the servlet class must be compiled. Eclipse compiles all of your classes automatically, as needed. If you are not using Eclipse, you probably want to use a build tool such as Apache Ant to compile source code and perform other build tasks. See the official App Engine documentation for information on using Apache Ant to build App Engine projects.

I suppose it’s traditional to explain how to compile a Java project from the command line using the javac command. You can do so by putting each of the JARs from war/WEB-INF/lib/ and the war/WEB-INF/classes/ directory in the classpath, and making sure the compiled classes end up in the classes/ directory. But in the real world, you want your IDE or an Ant script to take care of this for you. Also, when we introduce the datastore in the next few sections, we will need to add a step to building the project that makes this even more impractical to do by hand.

It’s time to test this application with the development web server. The Eclipse plug-in can run the application and the development server inside the Eclipse debugger.
To start it, select the Run menu, Debug As, and Web Application. The server starts, and prints the following message to the Console panel:

    The server is running at http://localhost:8080/

If you are not using Eclipse, you can start the development server using the dev_appserver command (dev_appserver.sh for Mac OS X or Linux). The command takes the path to the WAR directory as an argument, like so:

    dev_appserver war

Test your application by visiting the server’s URL in a web browser:

    http://localhost:8080

The browser displays a page similar to the Python example, shown earlier in Figure 2-4.

Users and Google Accounts

Right now, our clock displays the time in the UTC time zone. We’d like for our application to let the user customize the time zone, and to remember the user’s preference for future visits. To do that, we will use Google Accounts to identify which user is using the application.

Edit ClockServlet.java to resemble Example 2-13.

Example 2-13. Code for ClockServlet.java that displays Google Accounts information and links

    package clock;

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.SimpleTimeZone;
    import javax.servlet.http.*;
    import com.google.appengine.api.users.User;
    import com.google.appengine.api.users.UserService;
    import com.google.appengine.api.users.UserServiceFactory;

    @SuppressWarnings("serial")
    public class ClockServlet extends HttpServlet {
        public void doGet(HttpServletRequest req,
                          HttpServletResponse resp)
                throws IOException {
            // HH is the 24-hour clock; hh would print 12-hour time with no AM/PM marker.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS");
            fmt.setTimeZone(new SimpleTimeZone(0, ""));

            UserService userService = UserServiceFactory.getUserService();
            User user = userService.getCurrentUser();

            // The redirect path passed to the URL methods ("/") is assumed here.
            String navBar;
            if (user != null) {
                navBar = "<p>Welcome, " + user.getNickname() + "! You can <a href=\"" +
                         userService.createLogoutURL("/") +
                         "\">sign out</a>.</p>";
            } else {
                navBar = "<p>Welcome! <a href=\"" + userService.createLoginURL("/") +
                         "\">Sign in or register</a> to customize.</p>";
            }

            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println(navBar);
            out.println("<p>The time is: " + fmt.format(new Date()) + "</p>");
        }
    }

In a real application, you wouldn’t mix HTML and Java source code like this. You can use JavaServer Pages (JSPs) to represent the page, or you can use a templating engine to render output. To keep things simple, we will continue to write HTML directly from the servlet code for the rest of this example, but keep in mind this is not a best practice.

Using Eclipse, you can leave the development web server running while you edit code. When you save changes to code, Eclipse compiles the class, and if it compiles successfully, Eclipse injects the new class into the already-running server. In most cases, you can simply reload the page in your browser, and it will use the new code.

If you are not using Eclipse, shut down the development server by hitting Ctrl-C. Recompile your project, then start the server again.

Reload the new version of the clock app in your browser. The new page resembles the Python example, shown previously in Figure 2-5.

This version of the clock app uses the interface for Google Accounts provided by the com.google.appengine.api.users package. The app gets a UserService instance by calling the getUserService() method of the UserServiceFactory class. Then it calls the getCurrentUser() method of the UserService, which returns a User object, or null if the current user is not signed in. The getEmail() method of the User object returns the email address for the user.

The createLoginURL() and createLogoutURL() methods of the UserService generate URLs that go to Google Accounts. Each of these methods takes a URL path for the app where the user should be redirected after performing the desired task. The login URL goes to the Google Accounts page where the user can sign in or register for a new account. The logout URL visits Google Accounts to sign out the current user, then immediately redirects back to the given application URL without displaying anything.
If you click on the “Sign in or register” link with the app running in the development server, the link goes to the development server’s simulated version of the Google Accounts sign-in screen, similar to the Python version shown earlier in Figure 2-6. At this screen, you can enter any email address, and the development server will proceed as if you are signed in with an account that has that address.

If this app were running on App Engine, the login and logout URLs would go to the actual Google Accounts locations. Once signed in or out, Google Accounts redirects back to the given URL path for the live application.

Click on “Sign in or register,” then enter an email address (such as test@example.com) and click on the Login button on the simulated Google Accounts screen. The clock app now looks like Figure 2-7 (shown earlier). To sign out again, click the “sign out” link.

In addition to the UserService API, an app can also get information about the current user with the servlet “user principal” interface. The app can call the getUserPrincipal() method on the HttpServletRequest object to get a java.security.Principal object, or null if the user is not signed in. This object has a getName() method, which in App Engine is equivalent to calling the getEmail() method of a User object.

The main advantage to getting user information from the servlet interface is that the servlet interface is a standard. Coding an app to use standard interfaces makes the app easier to port to alternate implementations, such as other servlet-based web application environments or private servers. As much as possible, App Engine implements standard interfaces for its services and features. The disadvantage to the standard interfaces is that not all standard interfaces represent all of App Engine’s features, and in some cases the App Engine services don’t implement every feature of an interface.
All of the services include a nonstandard “low-level” API, which you can use directly or use to implement adapters to other interfaces.

Web forms and the datastore

Now that we can identify the user, we can prompt for the user’s preferences and remember them for future requests. We can store preferences data in the App Engine datastore.

The App Engine SDK supports two major standard interfaces for accessing the datastore: Java Data Objects (JDO) 2.3 and the Java Persistence API (JPA) 1.0. As with the other services, the datastore also has a low-level API.

Let’s use the JPA interface to store the user’s time zone setting. JPA requires a configuration file that specifies the JPA implementation to use, and other options. The final location of this file is war/WEB-INF/classes/META-INF/persistence.xml. If you are using Eclipse, you can create this file as src/META-INF/persistence.xml, and Eclipse will copy it to the final location automatically.

Create the file src/META-INF/persistence.xml with the contents shown in Example 2-14.

Example 2-14. The JPA configuration file, persistence.xml, with several useful options

    <?xml version="1.0" encoding="UTF-8" ?>
    <persistence xmlns="http://java.sun.com/xml/ns/persistence"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
            http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
        version="1.0">
      <persistence-unit name="transactions-optional">
        <provider>org.datanucleus.store.appengine.jpa.DatastorePersistenceProvider</provider>
        <properties>
          <property name="datanucleus.NontransactionalRead" value="true"/>
          <property name="datanucleus.NontransactionalWrite" value="true"/>
          <property name="datanucleus.ConnectionURL" value="appengine"/>
        </properties>
      </persistence-unit>
    </persistence>

The application interacts with the datastore using an EntityManager object. It gets this object from an EntityManagerFactory. For efficiency, it’s best to instantiate the factory only once per servlet. You can store this instance in a static member of a wrapper class.

Create a new class named EMF in the clock package (src/clock/EMF.java) resembling Example 2-15.

Example 2-15. The file EMF.java, a static wrapper class for an EntityManagerFactory instance

    package clock;

    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;

    public final class EMF {
        // The unit name must match the persistence-unit name in persistence.xml.
        private static final EntityManagerFactory emfInstance =
            Persistence.createEntityManagerFactory("transactions-optional");

        private EMF() {}

        public static EntityManagerFactory get() {
            return emfInstance;
        }
    }

JPA makes your Java data objects persistent. As far as the rest of your application is concerned, the data objects are just plain old Java objects (POJOs), with members and methods. When you create an instance of a data class, you declare it as persistent by passing it to the EntityManager. From that point on, JPA ensures that changes to the object are saved to the datastore. When you retrieve the object from the datastore later, it still has all of its data, and it retains its persistent behavior.

When you define a JPA data class, you declare it as a persistence-capable class, and optionally tell JPA how to save and restore instances of the class. You do this using Java annotations in the class definition. Example 2-16 shows the code for a user preferences data class called UserPrefs, using annotations to declare it as persistence-capable by JPA. Create this class (src/clock/UserPrefs.java).

Example 2-16. Code for UserPrefs.java, a data class using JPA annotations to make instances persistent

    package clock;

    import javax.persistence.Basic;
    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.Id;
    import com.google.appengine.api.users.User;
    import clock.EMF;
    import clock.UserPrefs;

    @Entity(name = "UserPrefs")
    public class UserPrefs {
        @Id
        private String userId;

        private int tzOffset;

        @Basic
        private User user;

        public UserPrefs(String userId) {
            this.userId = userId;
        }

        public String getUserId() {
            return userId;
        }

        public int getTzOffset() {
            return tzOffset;
        }

        public void setTzOffset(int tzOffset) {
            this.tzOffset = tzOffset;
        }

        public User getUser() {
            return user;
        }

        public void setUser(User user) {
            this.user = user;
        }

        // Fetches the stored preferences for the user, or returns a new,
        // not-yet-saved object if none exist.
        public static UserPrefs getPrefsForUser(User user) {
            UserPrefs userPrefs = null;

            EntityManager em = EMF.get().createEntityManager();
            try {
                userPrefs = em.find(UserPrefs.class, user.getUserId());
                if (userPrefs == null) {
                    userPrefs = new UserPrefs(user.getUserId());
                    userPrefs.setUser(user);
                }
            } finally {
                em.close();
            }

            return userPrefs;
        }

        public void save() {
            EntityManager em = EMF.get().createEntityManager();
            try {
                em.persist(this);
            } finally {
                em.close();
            }
        }
    }

The UserPrefs class has three members: the user ID, the user’s time zone preference, and the User object representing the current user (which contains the user’s email address). The class is declared as persistence-capable using the @Entity annotation. Its name argument specifies the name to be used in JPA queries, typically the same as the class name. By default, the name of the underlying datastore entity kind is derived from the simple name of the class, in this case UserPrefs.

The user field gets an @Basic annotation because JPA does not recognize its field type (User) as persistence-capable by default.
String and int are understood by JPA as persistence-capable by default.

The userId field is the primary key for the object, as declared by the @Id annotation. Unlike in a relational database, the key is not a field of a record, but a permanent aspect of the underlying datastore entity, set when the object is first saved. When the data class uses a String member as the primary key, JPA expects the member to be set to the key name, a value unique across all objects of this class, when the object is saved for the first time. This value cannot be changed once the object is saved.

This application creates a UserPrefs object for each user with preferences set, using the unique user ID provided by Google Accounts as the key name of the object’s key. When a user visits the clock, the app attempts to get the UserPrefs object via the key constructed from the user ID, and adjusts the clock display accordingly if such an object is found.

There are other ways to declare the primary key using JPA and App Engine, including a way to let the datastore assign a unique ID automatically. These are discussed in Chapter 8.

JPA attaches its plumbing to the data class after the class is compiled, in a step called “enhancement.” The annotations tell the enhancement process how to modify the compiled class bytecode, adding calls to the JPA API in places that ensure the object’s persistent members are saved to the datastore.

If you are using Eclipse and the plug-in, JPA enhancement happens automatically when data classes are compiled. The App Engine SDK includes an Ant plug-in that performs this step, and you can also run the enhancement process from your own build process by running a tool. See the official App Engine documentation for more information on performing the JPA class enhancement step from a build script.

In anticipation of adding caching logic later on, we’ve included two methods on this class for getting and saving UserPrefs objects.
The static method getPrefsForUser() takes a User object, as returned by the Google Accounts API, and attempts to fetch the UserPrefs object from the datastore for that user. The instance method save() stores the object in the datastore, creating a new datastore entity if one does not already exist for this key, or updating the existing entity if one does. (This save() method goes against JPA’s notion of automatic object persistence, but is a convenient way to integrate memcache later with very little code.)

It’s time to upgrade the clock application to allow the user to customize the time zone of the clock. Example 2-17 shows a new version of the ClockServlet class that retrieves the UserPrefs object for the currently signed-in user, if any, and uses it to customize the clock display. It also displays a web form that the user can submit to change her time zone preference.

Example 2-17. A new version of ClockServlet.java that adjusts the clock to the user’s time zone and displays a preferences form

    // ...
    import clock.UserPrefs;

    @SuppressWarnings("serial")
    public class ClockServlet extends HttpServlet {
        public void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss.SSSSSS");

            UserService userService = UserServiceFactory.getUserService();
            User user = userService.getCurrentUser();

            String navBar;
            String tzForm;
            if (user == null) {
                navBar = "<p>Welcome! <a href=\"" + userService.createLoginURL("/") +
                         "\">Sign in or register</a> to customize.</p>";
                tzForm = "";
                fmt.setTimeZone(new SimpleTimeZone(0, ""));
            } else {
                UserPrefs userPrefs = UserPrefs.getPrefsForUser(user);
                int tzOffset = 0;
                if (userPrefs != null) {
                    tzOffset = userPrefs.getTzOffset();
                }
                navBar = "<p>Welcome, " + user.getEmail() + "! You can <a href=\"" +
                         userService.createLogoutURL("/") + "\">sign out</a>.</p>";
                // The form posts the tz_offset field to /prefs (handled by PrefsServlet).
                tzForm = "<form action=\"/prefs\" method=\"post\">" +
                         "<input type=\"text\" name=\"tz_offset\" size=\"4\" " +
                         "value=\"" + tzOffset + "\" />" +
                         "<input type=\"submit\" value=\"Set\" />" +
                         "</form>";
                fmt.setTimeZone(new SimpleTimeZone(tzOffset * 60 * 60 * 1000, ""));
            }

            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println(navBar);
            out.println("<p>The time is: " + fmt.format(new Date()) + "</p>");
            out.println(tzForm);
        }
    }

To enable the preferences form, we need a servlet to parse the form data and update the datastore. Let’s implement this as a new servlet class. Create a class named PrefsServlet (src/clock/PrefsServlet.java) with the contents shown in Example 2-18.

Example 2-18. Code for PrefsServlet.java, a servlet that handles the preferences form

    package clock;

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.google.appengine.api.users.User;
    import com.google.appengine.api.users.UserService;
    import com.google.appengine.api.users.UserServiceFactory;
    import clock.UserPrefs;

    @SuppressWarnings("serial")
    public class PrefsServlet extends HttpServlet {
        public void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            UserService userService = UserServiceFactory.getUserService();
            User user = userService.getCurrentUser();

            UserPrefs userPrefs = UserPrefs.getPrefsForUser(user);
            try {
                // parseInt() throws NumberFormatException for missing or
                // non-integer input.
                int tzOffset = Integer.parseInt(req.getParameter("tz_offset"));
                userPrefs.setTzOffset(tzOffset);
                userPrefs.save();
            } catch (NumberFormatException nfe) {
                // User entered a value that wasn't an integer. Ignore for now.
            }

            resp.sendRedirect("/");
        }
    }

Next, we need to change web.xml to map this servlet to the /prefs URL. Edit this file and add the XML shown in Example 2-19.

Example 2-19. Mapping the URL /prefs to the PrefsServlet using a security constraint in web.xml (excerpt)

    <servlet>
      <servlet-name>prefs</servlet-name>
      <servlet-class>clock.PrefsServlet</servlet-class>
    </servlet>
    <servlet-mapping>
      <servlet-name>prefs</servlet-name>
      <url-pattern>/prefs</url-pattern>
    </servlet-mapping>

    <security-constraint>
      <web-resource-collection>
        <web-resource-name>prefs</web-resource-name>
        <url-pattern>/prefs</url-pattern>
      </web-resource-collection>
      <auth-constraint>
        <role-name>*</role-name>
      </auth-constraint>
    </security-constraint>

The order in which the URL mappings appear in the file does not matter. Longer patterns (not counting wildcards) match before shorter ones.

The <security-constraint> block tells App Engine that only users signed in with a Google Account can access the URL /prefs.
If a user who is not signed in attempts to access this URL, App Engine redirects the user to Google Accounts to sign in. When the user signs in, she is directed back to the URL she attempted to access. A security constraint is a convenient way to implement Google Accounts authentication for a set of URLs. In this case, it means that PrefsServlet does not need to handle the case where someone tries to submit data to the URL without being signed in.

The servlet accesses the form data using the HttpServletRequest object passed to the doPost() method. For now, if the user enters a noninteger in the form field, we don’t do anything. Later, we can implement an error message.

If the form data is valid, the servlet sets the value on the UserPrefs object obtained from our getPrefsForUser() method, then calls our save() method. The save() method opens an EntityManager, attaches the UserPrefs object (perhaps making it persistent for the first time), then closes the EntityManager to save the object in the datastore.

Finally, PrefsServlet redirects the user back to the main page. Redirecting after the form submission allows the user to reload the main page without resubmitting the form.

Restart your development server, then reload the page to see the customizable clock in action. Try changing the time zone by submitting the form. Also try signing out, then signing in again using the same email address, and again with a different email address. The app remembers the time zone preference for each user.

Caching with memcache

So far, our application fetches the object from the datastore every time a signed-in user visits the site. Since user preferences data doesn’t change very often, we can speed up the per-request data access using the memory cache as secondary storage.
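Before looking at the App Engine specifics, it may help to see the general pattern in isolation. The following is a minimal sketch that uses two in-memory maps as stand-ins for the datastore and the memory cache; the class CacheAsideSketch and all of its method names are hypothetical and are not part of any App Engine API.

```java
import java.util.HashMap;
import java.util.Map;

class CacheAsideSketch {
    // Hypothetical stand-ins: one map plays the datastore, the other the cache.
    private final Map<String, Integer> datastore = new HashMap<>();
    private final Map<String, Integer> cache = new HashMap<>();
    private int datastoreReads = 0;

    // Writes go to the durable store first, then refresh the cached copy.
    void save(String userId, int tzOffset) {
        datastore.put(userId, tzOffset);
        cache.put(userId, tzOffset);
    }

    // Reads try the cache first and fall back to the datastore on a miss.
    Integer load(String userId) {
        Integer cached = cache.get(userId);
        if (cached != null) {
            return cached;
        }
        datastoreReads++;
        Integer stored = datastore.get(userId);
        if (stored != null) {
            cache.put(userId, stored); // populate the cache for the next request
        }
        return stored;
    }

    // Simulates the cache discarding a value, which a memory cache may do at any time.
    void evict(String userId) {
        cache.remove(userId);
    }

    int datastoreReads() {
        return datastoreReads;
    }

    public static void main(String[] args) {
        CacheAsideSketch prefs = new CacheAsideSketch();
        prefs.save("user-1", -8);
        prefs.evict("user-1");
        System.out.println(prefs.load("user-1") + " after " + prefs.datastoreReads() + " datastore read(s)");
        System.out.println(prefs.load("user-1") + " after " + prefs.datastoreReads() + " datastore read(s)");
    }
}
```

In this sketch the second load() is served from the cache, so the datastore read counter stays at 1. The cached copy is only ever a convenience: every value can be rebuilt from the datastore.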
As with the other services, the App Engine SDK includes two interfaces to the memcache service: a featureful proprietary API, and an API that conforms to a proposed Java standard known as JCache (JSR 107). We could use either for this example; for now, we’ll use the proprietary API.

Because we limited fetching and saving UserPrefs objects to two methods, we can implement the caching of UserPrefs objects with minimal changes. Example 2-20 shows the needed changes to the UserPrefs class.

Example 2-20. Changes for UserPrefs.java to implement caching of UserPrefs objects

    import java.io.Serializable;
    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceException;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;
    // ...

    @SuppressWarnings("serial")
    @Entity(name = "UserPrefs")
    public class UserPrefs implements Serializable {
        // ...

        @SuppressWarnings("unchecked")
        public static UserPrefs getPrefsForUser(User myUser) {
            UserPrefs userPrefs = null;

            String cacheKey = "UserPrefs:" + myUser.getUserId();

            try {
                MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
                if (memcache.contains(cacheKey)) {
                    userPrefs = (UserPrefs) memcache.get(cacheKey);
                    return userPrefs;
                }
                // If the UserPrefs object isn't in memcache,
                // fall through to the datastore.
            } catch (MemcacheServiceException e) {
                // If there is a problem with the cache,
                // fall through to the datastore.
            }

            EntityManager em = EMF.get().createEntityManager();
            try {
                userPrefs = em.find(UserPrefs.class, myUser.getUserId());
                if (userPrefs == null) {
                    userPrefs = new UserPrefs(myUser.getUserId());
                    userPrefs.setUser(myUser);
                } else {
                    userPrefs.cacheSet();
                }
            } finally {
                em.close();
            }

            return userPrefs;
        }

        public void save() {
            EntityManager em = EMF.get().createEntityManager();
            try {
                em.persist(this);
            } finally {
                em.close();
            }
            // Save to the datastore first, then refresh the cached copy.
            cacheSet();
        }

        // Stores this object in memcache under the same key used for lookups.
        private void cacheSet() {
            try {
                MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
                memcache.put("UserPrefs:" + userId, this);
            } catch (MemcacheServiceException e) {
                // Ignore cache problems, nothing we can do.
            }
        }
    }

Any object you store in the memcache must be serializable; that is, it must implement the Serializable interface from the java.io package. For UserPrefs, it suffices to declare that the class implements the interface, since all the relevant members are already serializable.

The new version of the getPrefsForUser() static method checks to see whether the UserPrefs object for the given user is present in the cache before going to the datastore. Each cache value is stored with a key, which itself can be any serializable object. For UserPrefs objects, we use a cache key equivalent to the string "UserPrefs:" followed by the user ID from the User object. If a value with that key is not in the cache, or if there is a problem accessing the cache, the method proceeds to get the object from the datastore, then stores it in the cache by calling a new helper method, cacheSet().

Similarly, the new version of the save() method stores the object in the datastore, then also stores it in the cache. Without this step, a cached copy would keep serving the old time zone after the user changes it. There is no way to guarantee that the cache and the datastore contain the same value if one of the services fails and the other succeeds, but it’s usually sufficient to save to the datastore first, then save to the cache. For further safety, we could set the cache values to expire after a period of time, so if they do get out of sync, it won’t be for long. As written, the cache values persist as long as possible in memory.

Reload the page to see the new version work. To make the caching behavior more visible, you can add logging statements in the appropriate places, like so:

    import java.util.logging.*;
    import javax.persistence.Transient;
    // ...

    public class UserPrefs implements Serializable {
        @Transient
        private static Logger logger = Logger.getLogger(UserPrefs.class.getName());

        // ...
                if (memcache.contains(cacheKey)) {
                    logger.warning("CACHE HIT");
                    userPrefs = (UserPrefs) memcache.get(cacheKey);
                    return userPrefs;
                }
                logger.warning("CACHE MISS");

The development server prints logging output to the console. If you are using Eclipse, these messages appear in the Console pane.

The Development Console

The Python and Java development web servers include a handy feature for inspecting and debugging your application while testing on your local machine: a web-based development console. With your development server running, visit the following URL in a browser to access the console:

    http://localhost:8080/_ah/admin

In the Python Launcher, you can also click the SDK Console button to open the console in a browser window.

The Java development console is currently behind the Python console in features, but it’s catching up. Figure 2-9 shows the datastore viewer in the Python console.

Figure 2-9. The development console’s datastore viewer, Python version

The Python development console’s datastore viewer lets you list and inspect entities by kind, edit entities, and create new ones. You can edit the values for existing properties, but you cannot delete properties or add new ones, nor can you change the type of the value. For new entities, the console makes a guess as to which properties belong on the entity based on existing entities of that kind, and displays a form to fill in those properties. Similarly, you can only create new entities of existing kinds, and cannot create new kinds from the console.

The Python console also has a viewer for the memcache. You can see cache statistics, and inspect, create, and delete keys. Values are displayed and edited in their serialized (“pickled”) form.
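For Java objects, the serialized form of a cached value is what standard Java serialization produces, which is why UserPrefs had to implement Serializable. The sketch below shows a round trip through that form using only the JDK. The Prefs class is a hypothetical stand-in for UserPrefs, and the helper names toBytes() and fromBytes() are invented for this example; nothing here calls the memcache service itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {

    // Hypothetical stand-in for UserPrefs; both fields are serializable types.
    static class Prefs implements Serializable {
        private static final long serialVersionUID = 1L;
        final String userId;
        final int tzOffset;

        Prefs(String userId, int tzOffset) {
            this.userId = userId;
            this.tzOffset = tzOffset;
        }
    }

    // Converts an object to the byte form a cache could store.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an object from its serialized bytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = toBytes(new Prefs("user-1", -8));
        Prefs restored = (Prefs) fromBytes(stored);
        System.out.println(restored.userId + " " + restored.tzOffset);
    }
}
```

If a member of the class were not serializable, writeObject() would fail with a NotSerializableException, which is the situation the Serializable requirement guards against.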
An especially powerful feature of the Python console is the “Interactive Console.” This feature lets you type arbitrary Python code directly into a web form and see the results displayed in the browser. You can use this to write ad hoc Python code to test and manipulate the datastore, memcache, and global data within the local development server.

Here’s an example: run your clock application, sign in with an email address, then set a time zone preference, such as -8. Now open the Python development console, then select “Interactive Console.” In the lefthand text box, enter the following, where -8 is the time zone preference you used:

    from google.appengine.ext import db
    import models

    q = models.UserPrefs.gql("WHERE tz_offset = -8")
    for prefs in q:
        print prefs.user

Click the Run Program button. The code runs, and the email address you used appears in the righthand box.

Code run in the development console behaves just like application code. If you perform a datastore query that needs a custom index, the development server adds configuration for that index to the application’s index.yaml configuration file. Datastore index configuration is discussed in Chapter 5.

You can use the Python console to inspect the application’s task queue and cron job configuration in the browser. You can also use the task queue inspector to see tasks currently on the queue (in the local instance of the app), run them, and flush them. (The development server does not run task queues in the background; you must run them from the console. See Chapter 13.)

Lastly, you can test how your app receives email and XMPP messages by sending it mock messages through the console.

The Java development server also has a console. It includes a datastore viewer that lets you list and inspect datastore entities by kind, the ability to run task queues, and the ability to send email and XMPP messages to the app.
Registering the Application

Before you can upload your application to App Engine and share it with the world, you must first create a developer account, then register an application ID. If you intend to use a custom domain name (instead of the free appspot.com domain name included with every app), you must also set up the Google Apps service for the domain. You can do all of this from the App Engine Administration Console.

To access the Administration Console, visit the following URL in your browser:

    https://appengine.google.com/

Sign in using the Google account you intend to use as your developer account. If you don’t already have a Google account (such as a Gmail account), you can create one using any email address.

Once you have signed in, the Console displays a list of applications you have created, if any, and a button to “Create an Application,” similar to Figure 2-10. From this screen, you can create and manage multiple applications, each with its own URL, configuration, and resource limits.

Figure 2-10. The Administration Console application list, with one app

When you register your first application ID, the Administration Console prompts you to verify your developer account using an SMS message sent to your mobile phone. After you enter your mobile phone number, Google sends an SMS to your phone with a confirmation code. Enter this code to continue the registration process. You can verify only one account per phone number, so if you have only one mobile number (like most people), be sure to use it with the account you intend to use with App Engine.

If you don’t have a mobile phone number, you can apply to Google for manual verification by filling out a web form. This process takes about a week. For information on applying for manual verification, see the official App Engine website.

You can have up to 10 active application IDs created by a given developer account.
If you decide you do not want an app ID, you can disable it using the Administration Console to reclaim one of your 10 available apps. Disabling an app makes the app inaccessible by the public, and disables portions of the Console for the app. Disabling an app does not free the application ID for someone else to register. You can request that a disabled app be deleted permanently.

To disable or request deletion of an app, go to “Application Settings” in the Administration Console, and click the Disable Application... button. When you request deletion, all developers of the app are notified by email, and if nobody cancels the request, the app is deleted after 24 hours.

The Application ID and Title

When you click the “Create an Application” button, the Console prompts for an application identifier. The application ID must be unique across all App Engine applications, just like an account username.

The application ID identifies your application when you interact with App Engine using the developer tools. The tools get the application ID from the application configuration file. For Python applications, you specify the app ID in the app.yaml file, on the application: line. For Java applications, you enter it in the <application> element of the appengine-web.xml file.

In the example earlier in this chapter, we chose the application ID “clock” arbitrarily. If you’d like to try uploading this application to App Engine, remember to edit the appropriate configuration file after you register the application to change the application ID to the one you chose.

The application ID is part of the domain name you can use to test the application running on App Engine. Every application gets a free domain name that looks like this:

    app-id.appspot.com

The application ID is also part of the email and XMPP addresses the app can use to receive incoming messages. See Chapter 11.
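Since the ID ends up inside a hostname, it can be handy to check a candidate ID locally before trying to register it. The sketch below assumes the ID may contain only lowercase letters, digits, and hyphens and must be shorter than 32 characters; the class AppIdSketch and its methods are hypothetical, and the Console may enforce additional rules (such as reserved names) that this check does not cover.

```java
import java.util.regex.Pattern;

class AppIdSketch {
    // Assumed rule: 1 to 31 characters drawn from a-z, 0-9, and '-'.
    private static final Pattern ALLOWED = Pattern.compile("[a-z0-9-]{1,31}");

    // Returns true if the candidate ID passes the assumed character and length rules.
    static boolean looksValid(String appId) {
        return appId != null && ALLOWED.matcher(appId).matches();
    }

    // Builds the free appspot.com URL for an application ID.
    static String appspotUrl(String appId) {
        if (!looksValid(appId)) {
            throw new IllegalArgumentException("not a usable application ID: " + appId);
        }
        return "http://" + appId + ".appspot.com/";
    }

    public static void main(String[] args) {
        System.out.println(looksValid("clock"));
        System.out.println(looksValid("My_Clock"));
        System.out.println(appspotUrl("clock"));
    }
}
```

A check like this only catches malformed IDs. Whether an ID is still available can be determined only by the registration form itself.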
Because the application ID is used in the domain name, an ID can contain only lowercase letters, numbers, or hyphens, and must be shorter than 32 characters. Additionally, Google reserves every Gmail username as an application ID that only the corresponding Gmail user can register. As with usernames on most popular websites, a user-friendly application ID may be hard to come by.

When you register a new application, the Console also prompts for an “application title.” This title is used to represent your application throughout the Console and the rest of the system. In particular, it is displayed to a user when the application directs the user to sign in with a Google account. Make sure the title is as you would want your users to see it.

Once you have registered an application, its ID cannot be changed, though you can delete the application and create a new one. You can change the title for an app at any time from the Administration Console.

Setting Up a Domain Name

If you are developing a professional or commercial application, you probably want to use your own domain name instead of the appspot.com domain as the official location of your application. You can set up a custom domain name for your App Engine app using Google’s “software as a service” service, Google Apps.

Google Apps provides hosted applications for your business or organization, including email (with Gmail and POP/IMAP interfaces), calendaring (Google Calendar), chat (Google Talk), hosted word processing, spreadsheets and presentations (Google Docs), easy-to-edit websites (Google Sites), video hosting, and so forth. You associate these services with your organization’s domain name by mapping the domain to Google’s servers in its DNS record, either by letting Google manage the DNS for the domain or by pointing subdomains to Google in your own DNS configuration.
Your organization’s members access the hosted services using your domain name.

With App Engine, you can add your own applications to subdomains of your domain. Even if you do not intend to use the other Google Apps services, you can use Google Apps to associate your own domain with your App Engine application.

The website for Google Apps indicates that Standard Edition accounts are “ad-supported.” This refers to ads that appear on Google products such as Gmail. It does not refer to App Engine: Google does not place ads on the pages of App Engine applications, even those using free accounts. Of course, you can put ads on your own sites, but that’s your choice (and your ad revenue).

If you have not set up Google Apps for your domain already, you can do so during the application ID registration process. You can also set up Google Apps from the Administration Console after you have registered the app ID. If you haven’t yet purchased a domain name, you can do so while setting up Google Apps, and you can host the domain on Google’s name servers for free. To use a domain you purchased previously, follow the instructions on the website to point the domain to Google’s servers.

Once you have set up Google Apps for a domain, you can access the Google Apps dashboard at a URL similar to the following:

    http://www.google.com/a/example.com

To add an App Engine application as a service, click the “Add more services” link, then find Google App Engine in the list. Enter the application ID for your app, then click “Add it now.” On the following settings screen, you can configure a subdomain of your domain name for the application. All web traffic to this subdomain will go to the application.

Google Apps does not support routing web traffic for the top-level domain (such as http://example.com/) directly to an App Engine app.
If you bought the domain name through Google, an HTTP request to the top-level domain will redirect to http://www.example.com, and you can assign the “www” subdomain to your App Engine app. If Google does not maintain the DNS record for your domain, you will need to set up the redirect yourself using a web server associated with the top-level domain.

By default, the subdomain “www” is assigned to Google Sites, even if you do not have the Sites app activated. To release this subdomain for use with App Engine, first enable the Sites service, then edit the settings for Sites and remove the “www” subdomain.

Google Apps and Authentication

Google Apps allows your organization’s members (employees, contractors, volunteers) to create user accounts with email addresses that use your domain name (such as juliet@example.com). Members can sign in with these accounts to access services that are private to your organization, such as email or word processing. Using Apps accounts, you can restrict access to certain documents and services to members of the organization, like a hosted intranet that members can access from anywhere.

You can also limit access to your App Engine applications to just those users with accounts on the domain. This lets you use App Engine for internal applications such as project management or sales reporting. When an App Engine application is restricted to an organization’s domain, only members of the organization can sign in to the application’s Google Accounts prompt. Other Google accounts are denied access.

The authentication restriction must be set when the application is registered, in the “Authentication Options” section of the registration form. The default setting allows any user with a Google account to sign in to the application, leaving it up to the application to decide how to respond to each user. When the app is restricted to a Google Apps domain, only users with Google accounts on the domain can sign in.
After the application ID has been registered, the authentication options cannot be changed. If you want different authentication options for your application, you must register a new application ID.

The restriction applies only to the application’s use of Google Accounts. If the application has any URLs that can be accessed without signing in to Google Accounts (such as a welcome page), those URLs will still be accessible by everyone. One of the simplest ways to restrict access to a URL is with application configuration. For example, a Python application can require sign-in for all URLs with the following in the app.yaml file:

    handlers:
    - url: /.*
      script: main.py
      login: required

A Java app can do something similar in the application’s deployment descriptor (web.xml). See Chapter 3.

The sign-in restriction applies even when the user accesses the app using the appspot.com domain. The user does not need to be accessing the app with the Apps domain for the authentication restriction to be enforced.

If you or other members of your organization want to use Google Apps accounts as developer accounts, you must access the Administration Console using a special URL. For example, if your Apps domain is example.com, you would use the following URL to access the Administration Console:

    https://appengine.google.com/a/example.com

You sign in to the domain’s Console with your Apps account (for instance, juliet@example.com).

If you create an app using a non-Apps account, then restrict its authentication to the domain, you will still be able to access the Administration Console using the non-Apps account. However, you will not be able to sign in to the app with that account, including when accessing URLs restricted to administrators.

Uploading the Application

In a traditional web application environment, releasing an application to the world can be a laborious process.
Getting the latest software and configuration to multiple web servers and backend services in the right order and at the right time to minimize downtime and prevent breakage is often difficult and delicate. With App Engine, deployment is as simple as uploading the files with a single click or command. You can upload and test multiple versions of your application, and set any uploaded version to be the current public version.

For Python apps, you can upload an app from the Launcher, or from a command prompt. From the Launcher, select the app to deploy, then click the “Deploy” button. From a command prompt, run the appcfg.py command as follows, substituting the path to your application directory for clock:

    appcfg.py update clock

As with dev_appserver.py, clock is just the path to the directory. If the current working directory is the clock/ directory, you can use the relative path, a dot (.).

For Java apps, you can upload from Eclipse using the Google plug-in, or from a command prompt. In Eclipse, click the “Deploy to App Engine” button (the little App Engine logo) in the Eclipse toolbar. Or from a command prompt, run the appcfg (or appcfg.sh) command from the SDK’s bin/ directory as follows, using the path to your application’s WAR directory for war:

    appcfg update war

When prompted by these tools, enter your developer account’s email address and password. The tools remember your credentials for subsequent runs so you don’t have to enter them every time.

The upload process determines the application ID and version number from the app configuration file (app.yaml for Python apps, appengine-web.xml for Java apps), then uploads and installs the files and configuration as the given version of the app. After you upload an application for the first time, you can access the application immediately using either the .appspot.com subdomain or the custom Google Apps domain you set up earlier.
For example, if the application ID is clock, you can access the application with the following URL:

    http://clock.appspot.com/

There is no way to download an application’s files from App Engine after they have been uploaded. Make sure you are retaining copies of your application files, such as with a revision control system and regular backups.

Introducing the Administration Console

You manage your live application from your browser using the App Engine Administration Console. You saw the Console when you registered the application, but as a reminder, you can access the Console at the following URL:

    https://appengine.google.com/

If your app uses a Google Apps domain name and you are using an Apps account on the domain as your developer account, you must use the Apps address of the Administration Console:

    https://appengine.google.com/a/example.com

Select your application (click its ID) to go to the Console for the app.

The first screen you see is the “dashboard,” shown in Figure 2-11. The dashboard summarizes the current and past status of your application, including traffic and load, resource usage, and error rates. You can view charts for the request rate, the amount of time spent on each request, error rates, bandwidth and CPU usage, and whether your application is hitting its resource limits.

If you’ve already tested your new application, you should see a spike in the requests-per-second chart. The scale of the chart goes up to the highest point in the chart, so the spike reaches the top of the graph even though you have only accessed the application a few times.

The Administration Console is your home base for managing your live application. From here, you can examine how the app is using resources, browse the application’s request and message logs, and query the datastore and check the status of its indexes.
You can also manage multiple versions of your app, so you can test a newly uploaded version before making it the live “default” version. You can invite other people to be developers of the app, allowing them to access the Administration Console and upload new files. And when you’re ready to take on large amounts of traffic, you can establish a billing account, set a daily budget, and monitor expenses.

Take a moment to browse the Console, especially the Dashboard, Quota Details, and Logs sections. Throughout this book, we will discuss how an application consumes system resources, and how you can optimize an app for speed and cost effectiveness. The Administration Console is your main resource for tracking resource consumption and diagnosing problems.

Figure 2-11. The Administration Console dashboard for a new app

CHAPTER 3
Handling Web Requests

A web application is an application that responds to requests over the Web. Ideally, a web application is an application that responds to web requests quickly, doing the smallest amount of work required to return a response. Most web apps serve users interacting with the application in real time, and a fast response means less time the user is waiting for an action to be completed or information to be displayed. With user-facing web apps, milliseconds matter.

A less obvious advantage to an app that responds quickly is that it’s easier to scale. The less work the app does in response to a request, the more efficiently those requests can be distributed across multiple servers. It’s like scheduling meetings on a busy day: the shorter the meeting, the more likely you’ll be able to fit it in.

Apps with faster responses are also more tolerant of system faults. An app receiving 100 queries per second of traffic will have fewer simultaneous requests in progress at a given moment in time if each request takes 10 milliseconds than if each request takes 100 milliseconds (on average, about 1 request in progress versus about 10).
If a machine goes down and a portion of the requests in progress must be canceled, fewer users will be affected, and more subsequent new requests will be routed to other machines.

App Engine is designed for web applications that respond to requests quickly. An app that can respond within tens of milliseconds is doing pretty well. Occasionally, an app must take hundreds of milliseconds, such as to save data to the datastore or contact a remote server. If an app routinely takes a long time to respond to requests, App Engine triages the slow requests to make room in the schedule for faster ones.

In this chapter, we’ll take a look at App Engine’s request handling architecture, and follow the path of a web request through the system. We’ll discuss how to configure the system to handle different kinds of requests, including requests for static content, requests for the application to perform work, and secure connections (HTTP over SSL, also known as HTTPS). Finally, we’ll take a close look at the application runtime environments for Python and Java, how App Engine invokes an application to respond to requests, and how to take advantage of the environment to speed up request handling.

The App Engine Architecture

The architecture of App Engine, and therefore of an App Engine application, can be summarized as shown in Figure 3-1.

Figure 3-1. The App Engine request handling architecture

The first stop for an incoming request is the App Engine frontend. A load balancer, a dedicated system for distributing requests optimally across multiple machines, routes the request to one of many frontend servers. The frontend determines the app for which the request is intended from the request’s domain name, either the Google Apps domain and subdomain or the appspot.com subdomain. It then consults the app’s configuration to determine the next step.

The app’s configuration describes how the frontends should treat requests based on their URL paths.
A URL path may map to a static file that should be served to the client directly, such as an image or a file of JavaScript code. Or, a URL path may map to a request handler, application code that is invoked to determine the response for the request. You upload this configuration data along with the rest of your application. We’ll look at how to configure URL paths and static files in the next section.

If the URL path for a request does not match anything in the app’s configuration, the frontends return an HTTP 404 “Not Found” error response to the client. The frontends return a generic error response. If you want clients to receive a custom response when accessing your app (such as an HTTP 404 error code with a friendly HTML message), you can map a URL pattern that matches all URLs to a request handler that returns the custom response.

If the URL path of the request matches the path of one of the app’s static files, the frontend routes the request to the static file servers. These servers are dedicated to the task of serving static files, with network topology and caching behavior optimized for fast delivery of resources that do not change often. You tell App Engine about your app’s static files in the app’s configuration. When you upload the app, these files are pushed to the static file servers.

If the URL path of the request matches a pattern mapped to one of the application’s request handlers, the frontend sends the request to the app servers. The app server pool starts up an instance of the application on a server, or reuses an existing instance if there is one already running from a previous request. The server invokes the app by calling the request handler that corresponds with the URL path of the request, according to the app configuration.

You can configure the frontend to authenticate the user with Google Accounts.
The frontend can restrict access to URL paths with several levels of authorization: all users, users who have signed in, and users who are application administrators. With a Google Apps domain, you can also set your application to allow only users on the domain to access URLs. The frontend checks whether the user is signed in, and redirects the user to the Google Accounts sign-in screen if needed. We’ll look at authentication and authorization later in this chapter.

The app servers use one of several strategies for distributing requests and starting up instances, depending on the app’s traffic and resource usage patterns. As of this writing, the specifics of these strategies are still being developed and tuned, but they are all intended to work best with request handlers that return quickly. The goal is to maximize the throughput of app instances, so as many instances are running as needed to handle the current traffic levels, but not so many that instances are sitting around doing nothing. The app servers are also designed so that one request handler cannot interfere with the behavior or performance of another handler running on the same server.

When an app server receives a request for your application, the server checks the URL configuration to determine which of the application’s request handlers should process the request. The server invokes the request handler and awaits its response. The server manages the local resources available to the app, including CPU cycles, memory, and execution time, and ensures that apps do not consume system resources in a way that interferes with other apps.

Your application code executes in a runtime environment, an abstraction above the server hardware and operating system that provides access to system resources and other services. The runtime environment is a “sandbox,” a walled arena that lets the application use only the features of the server that can scale without interfering with other apps.
For instance, the sandbox prevents the application from writing to the server’s filesystem, or from making arbitrary network connections to other hosts.

Applications can access various services to perform tasks outside of the runtime environment. For instance, the URL Fetch service allows an app to make HTTP requests to remote machines, using Google’s infrastructure for fetching web pages. These services are the same scalable services used by Google’s own applications, such as Gmail, Google Reader, and Picasa. They provide a scalable alternative to performing similar tasks directly on the app server. All app servers use the same services, so data saved to the datastore by one request handler can be accessed by another.

The request handler prepares the response, then returns it and terminates. The app server does not send any data to the client until the request handler has terminated, so it cannot stream data or keep a connection open for a long time. When the handler terminates, the app server returns the response, and the request is complete.

The frontend takes the opportunity to tailor the response to the client. Most notably, the frontend will compress the response data using the “gzip” format if the client gives some indication that it supports compressed responses. This applies to both app responses and static file responses, and is done automatically. The frontend uses several techniques to determine when it is appropriate to compress responses, based on web standards and known browser behaviors. (If you are using a custom client that does not support compressed content, simply omit the “Accept-Encoding” request header to disable the automatic gzip behavior.)
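The compression decision described above can be illustrated with a minimal standalone sketch. It assumes the simplest possible rule: compress only when the Accept-Encoding request header lists gzip. The real frontend uses several techniques that are not reproduced here, and the function names client_accepts_gzip and tailor_response are hypothetical, not part of any App Engine API.

```python
import gzip


def client_accepts_gzip(request_headers):
    """Return True if the Accept-Encoding header lists gzip.

    Simplification: quality values such as "gzip;q=0" are ignored.
    """
    accepted = request_headers.get("Accept-Encoding", "")
    encodings = [part.split(";")[0].strip().lower()
                 for part in accepted.split(",")]
    return "gzip" in encodings


def tailor_response(request_headers, body):
    """Compress the body only when the client advertises gzip support.

    Returns a (response_headers, body_bytes) pair.
    """
    if client_accepts_gzip(request_headers):
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


if __name__ == "__main__":
    page = b"<p>hello</p>" * 200
    headers, data = tailor_response({"Accept-Encoding": "gzip, deflate"}, page)
    print(headers, len(page), len(data))
```

A client that sends no Accept-Encoding header gets the body back unchanged, which matches the advice in the text for custom clients that cannot handle compressed content.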
The frontends, app servers, and static file servers are governed by an “app master.” Among other things, the app master is responsible for deploying new versions of application software and configuration, and updating the “default” version served on an app’s user-facing domain. Updates to an app propagate quickly, but are not atomic in the sense that only code from one version of an app is running at any one time. If you switch the default version to new software, all requests that started before the switch are allowed to complete using their version of the software. (An app that makes an HTTP request to itself might find itself in a pickle, but you can manage that situation in your own code, if you really need to.)

Configuring the Frontend

You control how the frontend routes requests for your application using configuration files. These files reside alongside your application’s code and static files in your application directory. When you upload your application, all of these files are uploaded together as a single logical unit.

Let’s take a look at how to configure the frontend for an application. First, we’ll look at the overall layout and syntax for the configuration files for a Python app and for a Java app. Then we’ll discuss each frontend feature, with examples for each runtime environment.

Configuring a Python App

A Python application consists of files, including Python code for request handlers and libraries, static files, and configuration files. On your computer, these files reside in the application root directory. Static files and application code may reside in the root directory or in subdirectories. Configuration files always reside in fixed locations in the root directory.

You configure the frontend for a Python application using a file named app.yaml in the application root directory.
This file is in a format called YAML, a concise human-readable data format with support for nested structures like sequences and mappings.

Example 3-1 shows an example of a simple app.yaml file. We’ll discuss these features in the following sections. For now, notice a few things about the structure of the file:

• The file is a mapping of values to names. For instance, the value python is associated with the name runtime.
• Values can be scalars (python, 1), sequences of other values, or mappings of values to names. The value of handlers is a sequence of two values, each of which is a mapping containing two name-value pairs.
• Order is significant in sequences, but not mappings.
• YAML uses indentation to indicate scope.
• YAML supports all characters in the Unicode character set. The encoding is assumed to be UTF-8 unless the file uses a byte order mark signifying UTF-16.
• A YAML file can contain comments. All characters on a line after a # character are ignored, unless the # is in a quoted string value.

Example 3-1. An example of an app.yaml configuration file

    application: ae-book
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /css
      static_dir: css

    - url: /.*
      script: main.py

Runtime versions

Among other things, this configuration file declares that this application (or, specifically, this version of this application) uses the Python runtime environment. It also declares which version of the Python runtime environment to use. As of this writing, there is only one version of the Python runtime environment. If Google ever makes changes to the runtime environment that may be incompatible with existing applications, the changes will be released using a new version number. Your app will continue to use the version of the runtime environment specified in your configuration file, giving you a chance to test your app with the new version before uploading the new configuration.
You specify the name and version of the runtime environment in app.yaml using the runtime and api_version elements, like so:

    runtime: python
    api_version: 1

Configuring a Java App

A Java application consists of files bundled in a standard format called WAR (short for “Web application archive”). The WAR standard specifies the layout of a directory structure for a Java web application, including the locations of several standard configuration files, compiled Java classes, JAR files, static files, and other auxiliary files. Some tools that manipulate WARs support compressing the directory structure into a single file similar to a JAR. App Engine’s tools generally expect the WAR to be a directory on your computer’s filesystem.

Java servlet applications use a file called a “deployment descriptor” to specify how the server invokes the application. This file uses an XML format, and is part of the servlet standard specification. In a WAR, the deployment descriptor is a file named web.xml that resides in a directory named WEB-INF/, which itself is in the WAR’s root directory. Example 3-2 shows a very simple deployment descriptor.

Example 3-2. An example of a web.xml deployment descriptor file

    <web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
      <servlet>
        <servlet-name>ae-book</servlet-name>
        <servlet-class>aebook.MainServlet</servlet-class>
      </servlet>
      <servlet-mapping>
        <servlet-name>ae-book</servlet-name>
        <url-pattern>/*</url-pattern>
      </servlet-mapping>
    </web-app>

The deployment descriptor tells the App Engine frontend most of what it needs to know, but not all. For the rest, App Engine uses a file named appengine-web.xml, also in the WEB-INF/ directory and also using XML syntax. If your code editor supports XML validation, you can find the schema definition for this file in the App Engine Java SDK. Example 3-3 shows a brief example.

Example 3-3. An example of an appengine-web.xml configuration file

    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <application>ae-book</application>
      <version>1</version>
    </appengine-web-app>

When Google releases major new features for the Java API, the release includes a new version of the SDK with an updated appengine-api-... .jar file.
App Engine knows which version of the API the app is expecting by examining the API JAR included in the app’s WAR. The server may replace the JAR with a different but compatible implementation when the app is run.

Domain Names

Every app gets a free domain name on appspot.com, based on the application ID:

    app-id.appspot.com

Requests for URLs that use your domain name are routed to your app by the frontend.

    http://app-id.appspot.com/url/path...

You can register your own domain name (such as example.com) and set it up with Google Apps to point to your app. You assign a subdomain of your top-level domain to the app. For instance, if your registered domain is example.com and you assign the www subdomain, the domain name for the app is:

    www.example.com

Google Apps does not support routing requests for the top-level domain without a subdomain. If you want users to see something when they visit http://example.com/, you must use your own domain name service (DNS) and web server to handle traffic to that domain name, and point subdomains to Google Apps in the DNS record. If you use the Google Apps DNS service for the domain, Google Apps will automatically redirect web requests for the bare domain to the www subdomain.

The appspot.com domain has a couple of useful features. One such feature is the ability to accept an additional domain name part:

    anything.app-id.appspot.com

Requests for domain names of this form, where anything is any valid single domain name part (that cannot contain a dot, .), are routed to the application. This is useful for accepting different kinds of traffic on different domain names, such as for allowing your users to serve content from their own subdomains.

Only appspot.com domains support the additional part. Google Apps domains do not.

You can determine which domain name was used for the request in your application code by checking the Host header on the request.
Here’s how you check this header using Python and webapp:

    from google.appengine.ext import webapp

    class MainHandler(webapp.RequestHandler):
        def get(self):
            # The Host header carries the domain name the client requested.
            host = self.request.headers['Host']
            self.response.out.write('Host: %s' % host)

App IDs and Versions

Every App Engine application has an application ID that uniquely distinguishes the app from all other applications. As described in Chapter 2, you can register an ID for a new application using the Administration Console. Once you have an ID, you add it to the app’s configuration so the developer tools know that the files in the app root directory belong to the app with that ID.

The app’s configuration also includes a version identifier. Like the app ID, the version identifier is associated with the app’s files when the app is uploaded. App Engine retains one set of files and frontend configuration for each distinct version identifier used during an upload. If you do not change the app version in the configuration when you upload, the existing files for that version of the app are replaced.

Each distinct version of the app is accessible at its own domain name, of the following form:

    version-id.latest.app-id.appspot.com

When you have multiple versions of an app uploaded to App Engine, you can use the Administration Console to select which version is the one you want the public to access. The Console calls this the “default” version. When a user visits your Google Apps domain (and configured subdomain), or the appspot.com domain without the version ID, he sees the default version.

The appspot.com domain containing the version ID supports an additional domain part, just like the default appspot.com domain:

    anything.version-id.latest.app-id.appspot.com

Unless you explicitly prevent it, anyone who knows your application ID and version identifiers can access any uploaded version of your application using the appspot.com URLs.
You can restrict access to nondefault versions of the application using code that checks the domain of the request and only allows authorized users to access the versioned domains. You can’t restrict access to static files this way.

Another way to restrict access to nondefault versions is to use Google Accounts authorization, described later in this chapter. You can restrict access to app administrators while a version is in development, then replace the configuration to remove the restriction just before making that version the default version.

All versions of an app access the same datastore, memcache, and other services, and all versions share the same set of resources. Later on, we’ll discuss other configuration files that control these backend services. These files are separate from the configuration files that control the frontend because they are not specific to each app version.

There are several ways to use app versions. For instance, you can have just one version, and always update it in place. Or you can have a “dev” version for testing and a “live” version that is always the public version, and do separate uploads for each. Some developers generate a new app version identifier for each upload based on the version numbers used by a source code revision control system.

You can have up to 10 active versions. You can delete previous versions using the Administration Console.

Application IDs and version identifiers can contain numbers, lowercase letters, and hyphens.

App IDs and versions in Python

For a Python app, the application ID and version identifier appear in the app.yaml file. The app ID is specified with the name application. The version ID is specified as version.
Here is an example of app.yaml using dev as the version identifier:

    application: ae-book
    version: dev

This would be accessible using this domain name:

    http://dev.latest.ae-book.appspot.com

App IDs and versions in Java

The app ID and version identifier of a Java app appear in the appengine-web.xml file. The app ID is specified with the XML element <application>, and the version identifier is specified with <version>. For example:

    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <application>ae-book</application>
      <version>dev</version>
    </appengine-web-app>

As in the Python example, this version of this app would be accessible using this domain name:

    http://dev.latest.ae-book.appspot.com

Request Handlers

The app configuration tells the frontend what to do with each request, routing it to either the application servers or the static file servers. The destination is determined by the URL path of the request. For instance, an app might send all requests whose URL paths start with /images/ to the static file server, and all requests for the site’s home page (the path /) to the app servers. The configuration specifies a list of patterns that match URL paths, with instructions for each pattern.

For requests intended for the app servers, the configuration also specifies the request handler responsible for specific URL paths. A request handler is an entry point into the application code. In Python, a request handler is a script of Python code. In Java, a request handler is a servlet class. Each runtime environment has its own interface for invoking the application.

Request handlers in Python

All URL paths for Python apps are described in the app.yaml file using the handlers element. The value of this element is a sequence of mappings, where each item includes a pattern that matches a set of URL paths and instructions on how to handle requests for those paths.
Here is an example with four URL patterns:

    handlers:
    - url: /profile/.*
      script: userprofile.py

    - url: /css
      static_dir: css

    - url: /info/(.*\.xml)
      static_files: /datafiles/\1

    - url: /.*
      script: main.py

The url element in a handler description is a regular expression that matches URL paths. Every path begins with a forward slash (/), so a pattern can match the beginning of a path by also starting with this character. This URL pattern matches all paths:

    url: /.*

If you are new to regular expressions, here is the briefest of tutorials: the . character matches any single character, and the * character says the previous symbol, in this case any character, can occur zero or more times. There are several other characters with special status in regular expressions. All other characters, like /, match literally. So this pattern matches any URL that begins with a / followed by zero or more of any character.

If a special character is preceded by a backslash (\), it is treated as a literal character in the pattern. Here is a pattern that matches the exact path /home.html:

    url: /home\.html

See the Python documentation for the re module for an excellent introduction to regular expressions. The actual regular expression engine used for URL patterns is not Python’s, but it’s similar.

App Engine attempts to match the URL path of a request to each handler pattern in the order the handlers appear in the configuration file. The first pattern that matches determines the handler to use. If you use the catchall pattern /.*, make sure it’s the last one in the list, since a later pattern will never match.

To map a URL path pattern to application code, you provide a script element. The value is the path to a Python source file, relative to the application root directory.
If the frontend gets a request whose path matches a script handler, it routes the request to an application server to invoke the script and produce the response.

In the previous example, the following handler definition routes all URL paths that begin with /profile/ to a script named userprofile.py:

    - url: /profile/.*
      script: userprofile.py

We’ll take a closer look at how App Engine invokes a script handler later in this chapter.

Request handlers in Java

A Java web application maps URL patterns to servlets in the deployment descriptor (web.xml). You set up a servlet in two steps: the servlet declaration, and the servlet mapping.

The <servlet> element declares a servlet. It includes a <servlet-name>, a name for the purposes of referring to the servlet elsewhere in the file, and the <servlet-class>, the name of the class that implements the servlet. Here’s a simple example:

    <servlet>
      <servlet-name>ae-book</servlet-name>
      <servlet-class>aebook.MainServlet</servlet-class>
    </servlet>

The servlet declaration can also define initialization parameters for the servlet. This is useful if you want to use the same servlet class in multiple servlet declarations, with different parameters for each one. For example:

    <servlet>
      <servlet-name>ae-book</servlet-name>
      <servlet-class>aebook.MainServlet</servlet-class>
      <init-param>
        <param-name>colorscheme</param-name>
        <param-value>monochrome</param-value>
      </init-param>
      <init-param>
        <param-name>background</param-name>
        <param-value>dark</param-value>
      </init-param>
    </servlet>

To map a servlet to a URL path pattern, you use the <servlet-mapping> element. A mapping includes the <servlet-name> that matches a servlet declaration, and a <url-pattern>.

    <servlet-mapping>
      <servlet-name>ae-book</servlet-name>
      <url-pattern>/home/*</url-pattern>
    </servlet-mapping>

The URL pattern matches the URL path. It can use a * character at the beginning or end of the pattern to represent zero or more of any character. Note that this wildcard can only appear at the beginning or end of the pattern, and you can only use one wildcard per pattern.

The order in which URL mappings appear is not significant. The “most specific” matching pattern wins, determined by the number of nonwildcard characters in the pattern. The pattern /* matches all URLs, but will only match if none of the other patterns in the deployment descriptor match the URL.
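The “most specific pattern wins” rule for servlet mappings can be made concrete with a small sketch. It is written in Python purely for illustration and follows only the simplified description in the text (one wildcard at the start or end, specificity measured by the count of nonwildcard characters); it is not an implementation of the full servlet specification, and the names pattern_matches and pick_servlet are hypothetical.

```python
def pattern_matches(pattern, path):
    """Match a URL path against a pattern with at most one * wildcard.

    Assumption: the wildcard appears only at the beginning or the end,
    as described in the text.
    """
    if pattern.startswith("*"):
        return path.endswith(pattern[1:])
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern


def pick_servlet(mappings, path):
    """Return the servlet name whose matching pattern is most specific.

    mappings is a list of (url_pattern, servlet_name) pairs. Their order
    does not affect the result, except to break ties between patterns
    with the same number of nonwildcard characters.
    """
    candidates = [(p, name) for p, name in mappings if pattern_matches(p, path)]
    if not candidates:
        return None
    best = max(candidates, key=lambda item: len(item[0].replace("*", "")))
    return best[1]


if __name__ == "__main__":
    mappings = [("/*", "main"), ("/home/*", "home"), ("*.jsp", "pages")]
    for path in ("/home/news", "/about", "/forum/index.jsp"):
        print(path, "->", pick_servlet(mappings, path))
```

Under these assumptions the catchall /* is chosen only when no longer pattern matches, which mirrors the behavior described for the deployment descriptor.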
We’ll take a closer look at servlets and the Java runtime environment later in this chapter.

App Engine includes support for JavaServer Pages (JSPs). JSPs are dynamic web pages defined using a mix of HTML (or other output text) and Java code. JSPs are compiled to Java classes equivalent to servlets that output the static content and evaluate the Java code. You can build large dynamic websites using JSPs for all user-facing servlets. From the developer’s point of view, working with JSPs is a lot like working with files of static HTML content.

Like other JSP-capable web servers, App Engine compiles JSP files automatically, so an additional compilation step is usually not necessary. When you upload the app, App Engine compiles the JSPs, and stores the compiled servlet classes. No compilation occurs on the app servers themselves.

JSPs reside in the application directory, outside of the WEB-INF/ directory. The filename of a JSP must end with the characters .jsp. By default, each JSP is mapped automatically to a URL path equivalent to the path to the JSP file from the application root. So if a JSP file’s path is forum/home.jsp, it can be accessed with the URL path /forum/home.jsp. As we’ll see later in this chapter, this is also how URLs for static files work for Java. This lets you store JSPs and static files together in the application directory, and refer to them using intuitive paths in the HTML.

You can also set an explicit URL mapping for a JSP in the deployment descriptor by declaring it as a servlet. Instead of a <servlet-class> element, use a <jsp-file> element that contains the path to the file from the application root.

    <servlet>
      <servlet-name>forum-home</servlet-name>
      <jsp-file>/forum/home.jsp</jsp-file>
    </servlet>
    <servlet-mapping>
      <servlet-name>forum-home</servlet-name>
      <url-pattern>/forum</url-pattern>
    </servlet-mapping>

App Engine includes the JavaServer Pages Standard Tag Library (JSTL), a standard library of extensions for use with JSPs.
You do not need to add it to your app or install it, and in fact doing so might conflict with the one included with App Engine.

For more information on JSPs and the JSTL, see Head First Servlets and JSP by Brian Basham et al. (O’Reilly) and JavaServer Pages by Hans Bergsten (O’Reilly).

Static Files and Resource Files

Most web applications have a set of files that are served verbatim to all users, and do not change as the application is used. These can be media assets like images used for site decoration, CSS stylesheets that describe how the site should be drawn to the screen, JavaScript code to be downloaded and executed by a web browser, or HTML for full pages with no dynamic content. To speed up the delivery of these files and improve page rendering time, App Engine uses dedicated servers for static content. Using dedicated servers also means the app servers don’t have to spend resources on requests for static files.

Both the deployment process and the frontend must be told which of the application’s files are static files. The deployment process delivers static files to the dedicated servers. The frontend remembers which URL paths refer to static files, so it can route requests for those paths to the appropriate servers.

The static file configuration can also include a recommendation for a cache expiration interval. App Engine returns the cache instructions to the client in the HTTP header along with the file. If the client chooses to heed the recommendation, it will retain the file for up to that amount of time, and use its local copy instead of asking for it again. This reduces the amount of bandwidth used, but at the expense of browsers retaining old copies of files that may have changed.

To save space and reduce the amount of data involved when setting up new app instances, static files are not pushed to the application servers. This means application code cannot access the contents of static files using the filesystem.
The files that do get pushed to the application servers are known as “resource files.” These can include app-specific configuration files, web page templates, or other static data that is read by the app but not served directly to clients. Application code can access these files by reading them from the filesystem. The code itself is also accessible this way.

There are ways to specify that a file is both a resource file and a static file, depending on which runtime environment you are using.

Static files in Python

We’ve seen how request handlers defined in the app.yaml file can direct requests to scripts that run on the app servers. Handler definitions can also direct requests to the static file servers.

There are two ways to specify static file handlers. The easiest is to declare a directory of files as static, and map the entire directory to a URL path. You do this with the static_dir element, as follows:

    handlers:
    - url: /images
      static_dir: myimgs

This says that all of the files in the directory myimgs/ are static files, and the URL path for each of these files is /images/ followed by the directory path and filename of the file. If the app has a file at the path myimgs/people/frank.jpg, App Engine pushes this file to the static file servers, and serves it whenever someone requests the URL path /images/people/frank.jpg.

Notice that with static_dir handlers, the url pattern does not include a regular expression to match the subpath or filename. The subpath is implied: whatever appears in the URL path after the URL pattern becomes the subpath to the file in the directory.

The other way to specify static files is with the static_files element. With static_files, you use a full regular expression for the url. The URL pattern can use regular expression groups to match pieces of the path, then use those matched pieces in the path to the file.
The following is equivalent to the static_dir handler above:

    - url: /images/(.*)
      static_files: myimgs/\1
      upload: myimgs/.*

The parentheses in the regular expression identify which characters are members of the group. The \1 in the file path is replaced with the contents of the group when looking for the file. You can have multiple groups in a pattern, and refer to each group by number in the file path. Groups are numbered in the order they appear in the pattern from left to right, where \1 is the leftmost group, \2 is the next, and so on.

When using static_files, you must also specify an upload element. This is a regular expression that matches paths to files in the application directory on your computer. App Engine needs this pattern to know which files to upload as static files, since it cannot determine this from the static_files pattern alone (as it can with static_dir).

While developing a Python app, you keep the app’s static files in the application directory along with the code and configuration files. When you upload the app, App Engine determines which files are static files from the handler definitions in app.yaml. Files mentioned in static file handler definitions are pushed to the static file servers. All other files in the application directory are considered resource files, and are pushed to the application servers. As such, static files are not accessible to the application code via the filesystem.

The Python SDK treats every file as either a resource file or a static file. If you have a file that you want treated as both a resource file (available to the app via the filesystem) and a static file (served verbatim from the static file servers), you can create a symbolic link in the project directory to make the file appear twice to the deployment tool under two separate names. The file will be uploaded twice, and count as two files toward the file count limit.
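To make the group substitution used by static_files handlers more concrete, the following is a minimal Python 3 sketch of the mapping from a URL path to a file path. It is an illustration only, not App Engine's implementation: the function name resolve_static_file is hypothetical, and the sketch assumes that the url pattern must match the entire URL path.

```python
import re


def resolve_static_file(url_pattern, file_template, url_path):
    """Map a URL path to a file path the way a static_files handler describes.

    Hypothetical helper for illustration; assumes the pattern must match
    the whole URL path. Returns None when the pattern does not match.
    """
    match = re.fullmatch(url_pattern, url_path)
    if match is None:
        return None
    # expand() replaces \1, \2, ... in the template with the matched groups.
    return match.expand(file_template)


# The handler from the text: /images/(.*) -> myimgs/\1
print(resolve_static_file(r"/images/(.*)", r"myimgs/\1",
                          "/images/people/frank.jpg"))  # myimgs/people/frank.jpg

# A URL path outside /images/ is not handled by this pattern.
print(resolve_static_file(r"/images/(.*)", r"myimgs/\1",
                          "/css/site.css"))  # None
```

With more than one group in the pattern, the template can refer to each by number, for example a template of \1/\3 for a pattern with three groups.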
MIME types

When the data of an HTTP response is of a particular type, such as a JPEG image, and the web server knows the type of the data, the server can tell the client the type of the data using an HTTP header in the response. The type can be any from a long list of standard type names, known as MIME types. If the server doesn’t say what the type of the data is, the client has to guess, and may guess incorrectly.

By default, for static files, App Engine makes its own guess of the file type based on the last few characters of the filename (such as .jpeg). If the filename does not end in one of several known extensions, App Engine serves the file as the MIME type application/octet-stream, a generic type most web browsers treat as generic binary data.

If this is not sufficient, you can specify the MIME type of a set of static files using the mime_type element in the static file handler configuration. For example:

    - url: /docs/(.*)\.ps
      static_files: psoutput/\1.dat
      upload: psoutput/.*\.dat
      mime_type: application/postscript

This says that the application has a set of datafiles in a directory named psoutput/ whose filenames end in .dat, and these should be served using URL paths that consist of /docs/, followed by the filename with the .dat replaced with .ps. When App Engine serves one of these files, it declares that the file is a PostScript document.

You can also specify mime_type with a static_dir handler. All files in the directory are served with the declared type.

Cache expiration

It’s common for a static file to be used on multiple web pages of a site. Since static files seldom change, it would be wasteful for a web browser to download the file every time the user visits a page. Instead, browsers can retain static files in a cache on the user’s hard drive, and reuse the files when they are needed.

To do this, the browser needs to know how long it can safely retain the file. The server can suggest a maximum cache expiration in the HTTP response. You can configure the cache expiration period App Engine suggests to the client.

To set a default cache expiration period for all static files for an app, you specify a default_expiration value. This value applies to all static file handlers, and belongs at the top level of the app.yaml file, like so:

    application: ae-book
    version: 1
    runtime: python
    api_version: 1

    default_expiration: "5d 12h"

    handlers:
      # ...

The value is a string that specifies a number of days, hours, minutes, and seconds. As shown here, each number is followed by a unit (d, h, m, or s), and values are separated by spaces.

You can also specify an expiration value for static_dir and static_files handlers individually, using an expiration element in the handler definition. This value overrides the default_expiration value, if any. For example:

    handlers:
    - url: /docs/latest
      static_dir: /docs
      expiration: "12h"

If the configuration does not suggest a cache expiration period for a set of static files, App Engine does not give an expiration period when serving the files. Browsers will use their own caching behavior in this case, and may not cache the files at all.

Sometimes you want a static file to be cached in the browser as long as possible, but then replaced immediately when the static file changes. A common technique is to add a version number for the file to the URL, then use a new version number from the app’s HTML when the file changes. The browser sees a new URL, assumes it is a new resource, and fetches the new version.

You can put the version number of the resource in a fake URL parameter, such as /js/code.js?v=19, which gets ignored by the static file server.
Alternatively, in Python, you can use regular expression matching to match all versions of the URL and route them to the same file in the static file server, like so:

    handlers:
    - url: /js/(.*)/code.js
      static_files: js/code.js
      upload: js/code.js
      expiration: "90d"

This handler serves the static file js/code.js for all URLs such as /js/v19/code.js, using a cache expiration of 90 days.

If you’d like browsers to reload a static file resource automatically every time you launch a new major version of the app, you can use the multiversion URL handler just discussed, then use the CURRENT_VERSION_ID environment variable as the “version” in the static file URLs: self.response.out('