Questions tagged [google-search]

SEARCH ENGINE OPTIMIZATION (SEO) IS OFF-TOPIC. This tag is only for programming questions about the Google search engine.

Google is the most popular search engine in the world. The Google Web Search API has been deprecated in favor of the new Custom Search.
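As a rough illustration of what a Custom Search request looks like, the sketch below only builds the request URL for the Custom Search JSON API and does not send it. The function name `build_search_url` and the placeholder credentials are assumptions made for this example; a real API key and search engine ID are needed before the URL returns results.

```python
from urllib.parse import urlencode

# Endpoint of the Custom Search JSON API.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1):
    """Return the GET URL for one page of results.

    build_search_url is a hypothetical helper for this sketch.
    api_key and engine_id are placeholders, not working credentials.
    """
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    # urlencode percent-encodes special characters such as "$".
    return ENDPOINT + "?" + urlencode(params)


url = build_search_url("perl $- variable", "YOUR_API_KEY", "YOUR_ENGINE_ID")
print(url)
```

The response to such a request is JSON, so the result can be fetched and parsed with any HTTP client.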

A Google search may not return the results you expect, for reasons that include those given in answers and comments to What can you NOT find on Google?:

Google does not even attempt

  • To search for keywords that are special characters:

"Generally, punctuation is ignored, including @#$%^&*()=+[]\ and other special characters" -Franck Dernoncourt.

The search term double unary works but not --. See also Google displays my website as a spelling error.
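To see why a query such as -- finds nothing while double unary works, here is a toy sketch of a tokenizer that discards punctuation. It is an assumption made for illustration only and is not how Google actually processes queries; `toy_tokenize` is a hypothetical name.

```python
import re


def toy_tokenize(query):
    """Lowercase the query and keep only runs of ASCII letters and digits.

    Hypothetical stand-in for a search engine's tokenizer: every
    punctuation character is treated as a separator and dropped.
    """
    return re.findall(r"[a-z0-9]+", query.lower())


print(toy_tokenize("double unary"))      # two word tokens survive
print(toy_tokenize("--"))                # nothing left to search for
print(toy_tokenize("perl $- variable"))  # the "$-" part disappears
```

Under this model a query made only of punctuation becomes empty, which is why spelling the operator out in words tends to work better.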

Sites with too much content, with content of little value or that are impractical to index

May include:

  • Sites that don't have a crawlable site map and require Google to provide search terms to access the results available on the site might not be fully indexed. -Josephine Bonaparte
  • Smaller blogs that aren't regularly updated are often dumped from search results. Plus anything that they think is a splog (“a blog which the author uses to promote affiliated websites” -Wikipedia). -David
  • “Most of the Twitter content is not indexed by Google, even if it’s public.
    It used to be available to Google, but that’s no longer the case since their agreement expired.” -Alex
  • “Google does not index Tumblr all that well.
    Blog posts on Tumblr are easier to find using Tumblr search.” -David
  • “everything on Google Sites isn't (or is hardly) indexed.
    If you start a Google site, get your own domain.” -David

Copyright and other protected material

May include:

  • What the government thinks is not good for you. –Hellagot
    The example given was Germany, where Google “does not show thousands of sites … and the list increases by the thousands every year”.
  • What may infringe intellectual property rights. –einpoklum
    DMCA (Digital Millennium Copyright Act) was mentioned.
  • Census images.
    “Since the content are images that are often manually index, they usually found on paid-for sites like ancestry.com.” –amh

To see which URLs Google has been blocked from crawling, visit the Blocked URLs page of the Crawl section of Webmaster Tools.

Opt outs

  • Content explicitly disallowed by a domain's robots.txt file is excluded from the Google index. -amh
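A minimal sketch of how a crawler might check a robots.txt rule before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for this example, and the sketch only shows the generic check, not Googlebot's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() takes the file's lines, so no network request is made here.
parser.parse(robots_txt.splitlines())

blocked = parser.can_fetch("Googlebot", "https://example.com/private/report.html")
allowed = parser.can_fetch("Googlebot", "https://example.com/index.html")
print(blocked)  # the /private/ rule applies to this URL
print(allowed)  # no rule matches this URL
```

A well-behaved crawler would skip any URL for which `can_fetch` returns `False`.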

Technical complications

  • Websites that are not linked from other websites that Google already knows (perhaps from when the domain was under different ownership – Tim Post). Many websites are probably never linked from visible pages, so the Google spider will never find them unless they are manually submitted to Google via Webmaster Tools. –amh
  • Websites that are behind web forms that you need to fill out. –amh
  • The Deep Web: “Most of the Web's information is buried far down on dynamically generated sites, and standard search engines do not find it. Traditional search engines cannot "see" or retrieve content in the deep Web—those pages do not exist until they are created dynamically as the result of a specific search. As of 2001, the deep Web was several orders of magnitude larger than the surface Web.” -Wikipedia
  • May include the 408 billion web pages saved over time, according to the Wayback Machine. –pnuts
1618 questions
699
votes
15 answers

How do search engines deal with AngularJS applications?

I see two issues with AngularJS application regarding search engines and SEO: 1) What happens with custom tags? Do search engines ignore the whole content within those tags? i.e. suppose I have

Hey, this title is…

luisfarzati
  • 8,548
  • 6
  • 27
  • 27
479
votes
4 answers

How can I use a search engine to search for special characters?

Google strips most special characters from the text they index so it's not a good tool for many troubleshooting-related tasks, such as finding out what the variable "$-" is in perl, or searching for error output that is loaded with special…
jonderry
  • 21,385
  • 29
  • 93
  • 160
374
votes
8 answers

What database does Google use?

Is it Oracle or MySQL or something they have built themselves?
solrevdev
  • 8,303
  • 10
  • 37
  • 47
326
votes
10 answers

What are the alternatives now that the Google web search API has been deprecated?

Google Web Search API has been deprecated and replaced with Custom Search API (see http://code.google.com/apis/websearch/). I wanted to search the whole web but it looks like with the new API only custom sites can be searched. Is there a way to…
Dan
  • 9,093
  • 13
  • 48
  • 69
230
votes
2 answers

How to screenshot website in JavaScript client-side / how Google did it? (no need to access HDD)

I'm working on web application that needs to render a page and make a screenshot on the client (browser) side. I don't need the screenshot to be saved on the local HDD though, just kept it in RAM and send it to the application server later. I…
Paweł Szymański
  • 7,102
  • 6
  • 19
  • 18
136
votes
5 answers

How does Google Instant work?

Any ideas on exactly how the new google instant search works? It seems to just be AJAX calls to the old search, but it's pretty hard to simplify Google that much. Anybody have speculations? EDIT: I know there is AJAX sent with each keypress, but…
DexterW
  • 5,457
  • 5
  • 31
  • 44
135
votes
5 answers

Navigating Google search results using keyboard shortcuts

Some will think this is not related to programming but I think it is, because most of the time when I encounter programming issues I search on Google to find solutions or ways to do what I plan to do before I start writing it from scratch. Let's…
talsibony
  • 7,378
  • 5
  • 42
  • 41
82
votes
10 answers

"Real" link to file in Google search results?

I often search documents (mainly PDFs) using Google. But when I right click the link, or just hang the mouse cursor over it. What I get is NOT the real link, but some thing long and confusing like the…
mayasky
  • 945
  • 1
  • 7
  • 9
72
votes
10 answers

Designing a web crawler

I have come across an interview question "If you were designing a web crawler, how would you avoid getting into infinite loops? " and I am trying to answer it. How does it all begin from the beginning. Say Google started with some hub pages say…
68
votes
2 answers

Looking for special characters in Google

Do you know how to look for special characters with google...? I'm looking at bash code and there's the ## operator. I would like to know what It does but I wasn't able to figure out a way to protect the character (I'm not sure it's even…
LB40
  • 11,121
  • 16
  • 66
  • 104
61
votes
9 answers

Why does the Google homepage use deprecated HTML (ie. is not valid HTML5)?

I was looking at the www.google.com in Firebug and noticed something odd: The Google logo is centered using a center tag. So I went and checked the page with the W3C validator and it found 48 errors. Now, I know there are times when you can't make a…
Adrian Mester
  • 2,513
  • 1
  • 18
  • 21
43
votes
8 answers

How does google do the barrel roll?

If you Google, 'do a barrel roll', the whole page does a 360 rotation. Does anyone have any guesses as to how Google is doing this? I disabled javascript, and it still occurred, so maybe a css rotation?
wave
  • 1,730
  • 2
  • 15
  • 16
39
votes
4 answers

Is there a way to programmatically access Google's search engine results?

Does google offer a way to programmatically see their search engine results for a certain query? I want to build a tracking application so that a user can see what rank on the google results their website is for certain keywords. EDIT: The behavior…
Doug
  • 631
  • 2
  • 6
  • 6
32
votes
1 answer

What is &#39; and why does Google search replace it with apostrophe?

In what language does and - hash - three - nine - semicolon (&#39;) represent the apostrophe? I had some website data extracted in JSON format where some of the user comments had apostrophe which were replaced by &#39;. So, what representation it…
Rajesh Surana
  • 657
  • 1
  • 9
  • 15
28
votes
4 answers

how to force google to re-index a page

A website I've made has been recently hacked and Google indexed that hacked homepage and now its showing irrelevant text on search result. The hack has been resolved but the search results haven't changed. Is there a way to force Google to re-index…
Dominic Mercier
  • 798
  • 2
  • 8
  • 17