You may not want certain pages of your site crawled because they might not be useful to users if found in a search engine's search results. If you do want to prevent search engines from crawling your pages, Google Search Console has a friendly robots.txt generator to help you create this file. Note that if your site uses subdomains and you wish to have certain pages not crawled on a particular subdomain, you'll have to create a separate robots.txt file for that subdomain. For more information on robots.txt, we suggest this Webmaster Help Center guide on using robots.txt files.
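As a rough illustration of how a crawler might interpret such files, the sketch below uses Python's standard urllib.robotparser module to check URLs against per-host rules. The host names, paths, and the crawler name "ExampleBot" are hypothetical, and treating a host with no robots.txt as unrestricted is an assumption of this sketch rather than a statement about any particular search engine.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, one per host. Each subdomain has its
# own file; rules for www.example.com do not apply to blog.example.com.
ROBOTS_BY_HOST = {
    "www.example.com": "User-agent: *\nDisallow: /private/\n",
    "blog.example.com": "User-agent: *\nDisallow: /drafts/\n",
}


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


PARSERS = {host: build_parser(text) for host, text in ROBOTS_BY_HOST.items()}


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules for the URL's host permit crawling it."""
    host = urlsplit(url).netloc
    parser = PARSERS.get(host)
    if parser is None:
        # Assumption: a host without a robots.txt file is unrestricted.
        return True
    return parser.can_fetch(user_agent, url)


home_ok = is_allowed("https://www.example.com/index.html")
private_ok = is_allowed("https://www.example.com/private/report.html")
blog_private_ok = is_allowed("https://blog.example.com/private/report.html")
blog_draft_ok = is_allowed("https://blog.example.com/drafts/post.html")
```

Here the /private/ rule blocks only www.example.com; the same path on blog.example.com stays crawlable because that subdomain's file does not list it.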
In March 2006, KinderStart filed a lawsuit against Google over search engine rankings. KinderStart's website was removed from Google's index prior to the lawsuit, and the amount of traffic to the site dropped by 70%. On March 16, 2007, the United States District Court for the Northern District of California (San Jose Division) dismissed KinderStart's complaint without leave to amend, and partially granted Google's motion for Rule 11 sanctions against KinderStart's attorney, requiring him to pay part of Google's legal expenses.
Websites such as Delicious, Digg, Slashdot, Diigo, StumbleUpon, and Reddit are popular social bookmarking sites used in social media promotion. Each of these sites is dedicated to the collection, curation, and organization of links to other websites that users deem to be of good quality. This process is "crowdsourced", allowing amateur social media network members to sort and prioritize links by relevance and general category. Due to the large user bases of these websites, any link from one of them to another, smaller website may result in a flash crowd, a sudden surge of interest in the target website. In addition to user-generated promotion, these sites also offer advertisements within individual user communities and categories. Because ads can be placed in designated communities with a very specific target audience and demographic, they have far greater potential for traffic generation than ads selected simply through cookie and browser history. Additionally, some of these websites have implemented measures to make ads more relevant to users by allowing users to vote on which ones will be shown on pages they frequent. The ability to redirect large volumes of web traffic and target specific, relevant audiences makes social bookmarking sites a valuable asset for social media marketers.
Leaks via the Internet and social networking are one of the issues facing traditional advertising. Video and print ads are often leaked to the world via the Internet earlier than they are scheduled to premiere. Social networking sites allow those leaks to go viral and be seen by many users more quickly. Time differences are also a problem for traditional advertisers. When social events occur and are broadcast on television, there is often a time delay between airings on the east coast and west coast of the United States. Social networking sites have become a hub of comment and interaction concerning the event. This allows individuals watching the time-delayed broadcast on the west coast to know the outcome before it airs. The 2011 Grammy Awards highlighted this problem. Viewers on the west coast learned who won different awards from comments made on social networking sites by individuals watching live on the east coast. Since viewers already knew who had won, many tuned out and ratings were lower. All the advertisement and promotion put into the event was lost because viewers didn't have a reason to watch.
Webmasters and content providers began optimizing websites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all webmasters needed to do was submit the address of a page, or URL, to the various engines, which would send a "spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed. The process involves a search engine spider downloading a page and storing it on the search engine's own server. A second program, known as an indexer, extracts information about the page, such as the words it contains, where they are located, and any weight for specific words, as well as all links the page contains. All of this information is then placed into a scheduler for crawling at a later date.
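The spider, indexer, and scheduler steps described above can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not how any real search engine is implemented: FAKE_WEB is a hypothetical in-memory stand-in for downloading pages, the names PageParser and crawl are invented for illustration, word weighting is omitted, and only the extracted links are handed to the scheduler.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML) standing in for real downloads.
FAKE_WEB = {
    "http://example.com/": '<a href="http://example.com/about">About</a> welcome home',
    "http://example.com/about": '<a href="http://example.com/">Home</a> about this site',
}


class PageParser(HTMLParser):
    """Indexer helper: collects link targets and words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(seed):
    store = {}                  # spider's stored copy of each page
    index = {}                  # word -> {url: [positions on that page]}
    scheduler = deque([seed])   # URLs waiting to be crawled
    seen = {seed}
    while scheduler:
        url = scheduler.popleft()
        html = FAKE_WEB.get(url)
        if html is None:
            continue            # page not available in the toy web
        store[url] = html       # spider: "download" and store the page
        parser = PageParser()
        parser.feed(html)
        parser.close()          # flush any trailing text
        # Indexer: record each word and where it occurs on the page.
        for position, word in enumerate(parser.words):
            index.setdefault(word, {}).setdefault(url, []).append(position)
        # Scheduler: queue newly discovered links for a later crawl.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                scheduler.append(link)
    return store, index


store, index = crawl("http://example.com/")
```

Starting from a single submitted URL, the loop discovers the second page through its link, which mirrors how early engines expanded their catalogs from webmaster submissions.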