22

The codebase I work on is huge, and grepping it takes about 20 minutes. I'm looking for a good web-based source code search engine, something like an intranet version of koders.com.

The only thing I've found is Krugle Enterprise Edition, which doesn't post its prices... and if you have to ask, you can't afford it.

I'd really prefer a plain old search engine, without a lot of other bells and whistles.

The source is mostly ASP.NET/C# and Javascript.

Ira Baxter
  • 88,629
  • 18
  • 158
  • 311
toohool
  • 1,007
  • 1
  • 9
  • 13
  • Can you explain what exactly the objective is, e.g. your own sourceforge for code, or do you need an extended viewer? Is the primary use search/grep? And what do you expect to "find"? I briefly looked at koders.com and I can't (really) imagine a use case for a company, hence the question. – Till Sep 19 '08 at 23:50
  • And where do you store your code base? If everybody has a checked-out local copy (as should happen in modern VCSs), it should go pretty fast. If you're working on NFS, you can search the code base only as fast as you can transfer the whole thing over your LAN. – David Thornley Nov 12 '09 at 18:50
  • @David: ... if you insist on reading the text of each file while you search. If you index the files first, you don't need to scan the text and it can be a lot faster. See my answer. – Ira Baxter Nov 14 '09 at 20:29
  • You might take a look at a product called http://www.elasticsearch.org/ which is a more general purpose scalable search engine that might also happen to make a pretty decent source search solution. – Norman H Oct 17 '12 at 18:52
  • You may also note that Krugle has a very explicitly free edition which will index up to 1GB of source. Seems like 1GB ought to keep most small teams busy for some time! :-) – Norman H Oct 17 '12 at 19:05

11 Answers

9

I recommend OpenGrok. There are some other engines; here's a quick review of them.

alanc
  • 3,921
  • 18
  • 24
Mauricio Scheffer
  • 96,120
  • 20
  • 187
  • 273
6

20 minutes is outrageous! I'm working with a million+ line source code base these days and grepping takes a few seconds at most (I use ack). Our home directories are stored on a file server and mounted over NFS, and to speed up grepping we do that while logged in to the file server. I'm not sure how long it takes over NFS, but it's certainly longer.

We also do source control operations while logged in to the file server, for the same performance reasons.

Greg Hewgill
  • 828,234
  • 170
  • 1,097
  • 1,237
  • ack is great. And you could probably throw together a rudimentary web frontend in less than an hour. – Thomas Nov 12 '09 at 18:46
3

On Linux I use the GNU ID Utils. They do much the same job as grep but work from an index, so they are incredibly fast. You run mkid to create the index and then query it with one of the other utilities, such as gid, which is the ID Utils version of grep. I have a cron job that runs mkid occasionally.

The ID tools work on Windows as well, either under Cygwin or as a standard Windows program.
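To illustrate why an index makes lookups fast, here is a minimal Python sketch of the general idea (build once, query many times). It is not how mkid or gid are actually implemented; the names `build_index`, `lookup`, and `read_tree`, and the file suffixes, are assumptions made up for this example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Identifier-like tokens; a deliberately simple approximation.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(files):
    """Map each token to the set of (path, line number) pairs where it occurs.

    `files` is a dict of path -> file text. This is the slow pass that
    reads everything; it only has to run when the sources change.
    """
    index = defaultdict(set)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN.findall(line):
                index[token].add((path, lineno))
    return index


def lookup(index, token):
    """Return sorted (path, line number) hits without rescanning any file."""
    return sorted(index.get(token, ()))


def read_tree(root, suffixes=(".cs", ".js", ".aspx")):
    """Hypothetical helper: load matching files under `root` into a dict."""
    return {
        str(p): p.read_text(errors="ignore")
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    }
```

A query such as `lookup(build_index(read_tree("src")), "total")` then costs a dictionary lookup rather than a pass over every file, which is the trade the cron job buys.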

David Dibben
  • 16,774
  • 6
  • 38
  • 40
2

LXR works great on big code bases, as proven with the Linux kernel. I think it's only for C (you didn't specify the languages used).

tsg
  • 1,947
  • 10
  • 11
1

If you have that much source code, you may need to put a bit of time into setting up a search engine to index it. I would recommend Lucene: it's free, it's fast, and anyone with programming experience will find it pretty easy to set up a powerful index on any content.

http://lucene.apache.org/

Peter
  • 28,255
  • 17
  • 83
  • 120
  • I was hoping for a nice shrinkwrapped solution. But if we can't find one, we could end up building a search engine around Lucene or similar. – toohool Sep 20 '08 at 00:23
  • Yeah - I'm assuming you really have a ton of code - we deal with ~1 million lines, and find that it can be handled adequately in good modern IDEs (IntelliJ for example) on a powerful desktop as long as things are broken down into modules. – Peter Sep 20 '08 at 00:56
1

Since you're saying 'grepping' I imagine you're not disinterested in command-line solutions.

A tool like ctags will index and search C# and JavaScript codebases (among many others).

What's very neat about ctags is that it can be combined with Vim, either through the taglist plugin for source code browsing or through Vim's omnicomplete for code completion.
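If you want to consume the index outside an editor, the `tags` file that ctags writes is plain text: each entry is a tag name, a file path, and an ex address (a search pattern or line number), separated by tabs, optionally followed by `;"` and extension fields. The sketch below is a rough Python reader for that layout; `parse_tags` is a name invented here, and it assumes the address itself does not contain the sequence `;"`.

```python
from collections import defaultdict


def parse_tags(text):
    """Parse the contents of a ctags `tags` file into name -> [(path, address)].

    Assumes the usual tab-separated layout:
        name<TAB>path<TAB>address;"<TAB>extension fields
    Lines starting with "!_TAG_" are metadata and are skipped.
    """
    tags = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue
        # Split only twice: the address may itself contain tab characters.
        parts = line.split("\t", 2)
        if len(parts) < 3:
            continue  # malformed line, ignore it
        name, path, rest = parts
        # Drop the optional extension fields after the ;" marker.
        address = rest.split(';"', 1)[0]
        tags[name].append((path, address))
    return tags
```

Given the text of a tags file, `parse_tags(text)["Save"]` would list every file and address where a symbol named `Save` (a made-up example) is defined.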

mbac32768
  • 11,025
  • 9
  • 31
  • 39
1

I've used cs2project for a while; it's an open source C# code search engine based on Lucene.NET. Unfortunately it's no longer being developed.

Igor Brejc
  • 17,626
  • 12
  • 72
  • 92
1

I have used OpenGrok before and was quite happy with it. Another alternative is:

Gonzui http://gonzui.sourceforge.net/screenshots.html



Glorfindel
  • 19,729
  • 13
  • 67
  • 91
f3lix
  • 27,786
  • 10
  • 63
  • 82
0

See our SD Source Code Search Engine. Language aware and handles many languages (C, C++, C#, Java, ObjectiveC, PHP, VB.net, VB6, Ada, Fortran, COBOL, ...). Takes 2.8 seconds to search across the Linux kernel (7.3 million lines, 18000+ files).

Because it is language aware, it can ignore language elements irrelevant to your search (e.g., ignore comments, formatting and whitespace if you are only interested in an identifier or an expression). It can search inside identifiers, strings and comments. It has a full regular-expression string search option if you really want to do that.
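As a rough illustration of what "ignoring comments" means in practice, here is a small Python sketch for C-style syntax. It is not how the SD tool works; it is a regex approximation that blanks out comments and string literals before looking for an identifier. The names `code_only` and `find_identifier` are made up for this example, and it does not handle C# verbatim strings or JavaScript regex literals.

```python
import re

_LINE_COMMENT = r"//[^\n]*"
_BLOCK_COMMENT = r"/\*.*?\*/"
_DQ_STRING = r'"(?:\\.|[^"\\\n])*"'
_SQ_STRING = r"'(?:\\.|[^'\\\n])*'"

# Strings are scanned too, so that a "//" inside a literal such as
# "http://example" is not mistaken for the start of a comment.
_SCANNER = re.compile(
    "|".join([_LINE_COMMENT, _BLOCK_COMMENT, _DQ_STRING, _SQ_STRING]),
    re.DOTALL,
)


def code_only(source):
    """Replace comments and string literals with spaces, keeping newlines
    so that line numbers in the result match the original source."""
    return _SCANNER.sub(lambda m: re.sub(r"[^\n]", " ", m.group(0)), source)


def find_identifier(source, name):
    """Return 1-based line numbers where `name` appears as a whole word
    in code, outside comments and string literals."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    lines = code_only(source).splitlines()
    return [i for i, line in enumerate(lines, start=1) if pattern.search(line)]
```

A plain grep for the same name would also report the hits inside comments and strings, which is the noise a language-aware search avoids.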

It has been used for systems of 10s of millions of lines of code, and in one case we know about, a system with over a million files.

Ira Baxter
  • 88,629
  • 18
  • 158
  • 311
0

I had a similar problem. I work for a software company where the project involves C#, C++, ASP.NET, DB scripts and even VB6 source code (yeah, it is a headache compiling multiple VB6 projects when there is no concept of a solution like in later versions of Visual Studio...)

I have been using Visual Studio 2010 but had to use a third-party text editor to search in DB scripts and VB6 source code.

I did some research and found KodeEx (http://kodeex.com) and have been happy with it. It is an index-based source code search tool. You don't have to build anything (like other people suggested you do with Lucene. Lucene is a nice open source project by the way =) ). Just install it and let it index your projects. After that it usually returns results within a few seconds.

user156144
  • 2,077
  • 2
  • 26
  • 40
-1

Perhaps you should invest some time and/or money in an editor or IDE that supports symbol tagging. You only need to make one pass through the entire source tree to tag it, and thereafter the editor uses an index search or map lookup to find the symbol definition or references.

Some examples of editors or IDEs that support tagging are Eclipse, Visual Studio, SlickEdit. Some IDEs might call the feature Symbol Browser or something similar.

shoover
  • 3,028
  • 1
  • 29
  • 38
  • Would that approach work with uncompiled code, like ASPX or Javascript files? Would code comments be searchable? We really need a full-text search. – toohool Sep 20 '08 at 00:03
  • Wow, still receiving downvotes after 4.3 years. If I were writing this answer today, I would change the tone. When I reread it now, the original answer sounds a little preachy. – shoover Jan 05 '15 at 18:03
  • These days I use Sublime Text, which has syntax highlighters for every language I've used in the past year (Java, Groovy, Clojure, Javascript, CSS, Haskell, R), plus a healthy community of users and plugin developers. It can also highlight ASP, and someone has helpfully provided a tweak for [ASPX](http://myfreakinname.blogspot.com/2013/06/adding-aspx-to-sublime-text-2s-syntax.html). The full-text search (yes, comments too) is super fast and you can search through multiple projects from within the editor. I have no affiliation with ST; I'm just a happy paying customer. – shoover Jan 05 '15 at 18:04