Wednesday, July 2, 2008

Google is Learning to Crawl Flash

Until recently, webmasters were wary of building Flash sites because search engines could not crawl and index them. Now Google has announced that its crawlers are learning to read Flash. The company has developed an algorithm that can read text embedded in Flash content, from "Flash menus, buttons and banners, to self-contained Flash websites."

Here are some questions and answers from Google's announcement on the Google Webmaster Central blog:

Q: Which Flash files can Google better index now?
We've improved our ability to index textual content in SWF files of all kinds. This includes Flash "gadgets" such as buttons or menus, self-contained Flash websites, and everything in between.

Q: What content can Google better index from these Flash files?
All of the text that users can see as they interact with your Flash file. If your website contains Flash, the textual content in your Flash files can be used when Google generates a snippet for your website. Also, the words that appear in your Flash files can be used to match query terms in Google searches.

In addition to finding and indexing the textual content in Flash files, we're also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages. For example, if your Flash application contains links to pages inside your website, Google may now be better able to discover and crawl more of your website.

Q: What about non-textual content, such as images?
At present, we are only discovering and indexing textual content in Flash files. If your Flash files only include images, we will not recognize or index any text that may appear in those images. Similarly, we do not generate any anchor text for Flash buttons which target some URL, but which have no associated text.

Also note that we do not index FLV files, such as the videos that play on YouTube, because these files contain no text elements.

Q: How does Google "see" the contents of a Flash file?
We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, entering input, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. We can't tell you all of the proprietary details, but we can tell you that the algorithm's effectiveness was improved by utilizing Adobe's new Searchable SWF library.
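Google's algorithm is proprietary, so the following is only a toy illustration of the general idea described in that answer: visit every screen a user could reach by clicking buttons, and remember the text seen along the way. The TOY_MOVIE dictionary and the collect_visible_text function are hypothetical names made up for this sketch. It does not read real SWF files and does not use Adobe's Searchable SWF library.

```python
from collections import deque

# Hypothetical stand-in for a Flash movie: each screen lists the text a user
# would see and the buttons that lead to other screens. Not a real SWF parser.
TOY_MOVIE = {
    "home": {"text": ["Welcome"], "buttons": {"Products": "products", "About": "about"}},
    "products": {"text": ["Our products"], "buttons": {"Back": "home"}},
    "about": {"text": ["About us"], "buttons": {"Back": "home"}},
}


def collect_visible_text(movie, start):
    """Breadth-first walk over screens, recording text in the order it is seen."""
    seen_screens = {start}
    queue = deque([start])
    collected = []
    while queue:
        screen = queue.popleft()
        collected.extend(movie[screen]["text"])
        for label, target in movie[screen]["buttons"].items():
            collected.append(label)  # a button label is visible text too
            if target not in seen_screens:  # "click" only into unvisited screens
                seen_screens.add(target)
                queue.append(target)
    return collected


if __name__ == "__main__":
    print(collect_visible_text(TOY_MOVIE, "home"))
```

In this sketch, a button with no label would contribute no text at all, which mirrors Google's note that it generates no anchor text for Flash buttons that have no associated text.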

This is an impressive accomplishment, considering how many websites are still built with Flash. Website owners can now use Flash with less worry about whether search engines can read the text inside it. The limits above still apply, though: text that appears only in images is not indexed, and neither are FLV video files.

To learn more about Google and Flash, read the full announcement: http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html
