The Hype About Repository Managers
Repositories have become an essential part of software development. Great importance is being given to their use, and as a result many open source tools for managing them are now available. As an Archiva developer, I wanted to compare the features each one offers (even if I may be biased).
Archiva
Archiva is a build artifact repository manager for use with build tools such as Maven, Continuum and Ant. It can act as a nearby (proxy) cache of popular global repositories. It has other features such as repository purge (of snapshots), repository search and browse, securing repositories, identifying unknown artifacts and reporting of repository problems. The version I’ve used here is 1.0-beta-3. Below are the features of Archiva:
- Easy to run. Just unpack the binaries, cd to /bin/linux-x86-32 and execute ‘run.sh console’ (the last two steps depend on the OS you’re using). You could also opt to just execute ‘plexus.sh’ or ‘plexus.bat’ in /bin.
- Proxy and cache. Archiva allows hosting of private (managed) repositories and proxying of other (remote) repositories. Proxying is one of its major features and is built on the concept of proxy connectors: a proxy connector is a piece of configuration that links a managed repository to a remote repository it proxies. Archiva also allows blacklist and whitelist patterns, as well as download policies, to be configured for proxying.
- Repository configuration. Configuration is stored in the archiva.xml file. The configuration can be edited via the webapp. Items which are configurable are: repositories, repository scanning, database scanning, proxy connectors, network proxies and consumers. Remote and managed (local) repositories have separate configurations so it is easy to identify one from the other. Repository scanning, where indexing and repository purging happens, can be executed per repository. This can be scheduled or explicitly triggered in the Configuration page.
- Repository purge. If enabled, Archiva automatically deletes old snapshots in the repositories during repository scanning. You can choose between two criteria: the number of days old or a retention count. Deletion of released snapshots can also be enabled by checking the ‘Delete Released Snapshots’ checkbox in the repository configuration page.
- Reports. Archiva uses Jasper Reports for reporting. The reporting is currently limited to a report of defective or problematic artifacts in the repositories. You can configure the number of rows per page to be displayed in the report.
- Search and repository browse. You can search for artifacts in Archiva as well as browse the repository. The downside is that there is no separation among the different repositories: all the artifacts appear together in the Browse page. The upside is that the artifact info is very informative and useful, which I think more than compensates. You can still browse each repository individually via webdav: a webdav URL is available in the repository configuration page, and clicking it lets you browse that repo in the web browser. Artifacts can also be downloaded, though they cannot be deleted via the Repository Browse.
- Finding artifacts. You can find artifacts in the repository via the checksums. You can either input the checksum manually OR browse for the unknown artifact itself and Archiva will calculate its checksum and search for it in the repo.
- User interface. Archiva uses Webwork for the user interface. The UI is simple and organized, though load time of pages could still be improved.
- Deploying artifacts. Archiva has support for artifact deployment via webdav.
- Source code. The source code is designed very well. The modules are properly organized into components, which makes it easy to plug in new features as well as to disable existing ones. Below is Archiva’s source structure:
- archiva-base - contains the core modules of Archiva, which are the configuration, model, converters, repositories, indexing, proxies, policies and utility modules.
- archiva-cli
- archiva-database - the module which deals with the database processes.
- archiva-reporting - the module for Archiva’s reporting mechanism.
- archiva-scheduled - the module which handles the scheduling and execution of specific tasks.
- archiva-site - contains the information for Archiva’s web site.
- archiva-web - contains the modules for the web component.
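The ‘Finding artifacts’ feature in the list above works from a checksum, so it can be handy to compute one locally before typing it into the search form. Below is a minimal Python sketch, assuming the checksums of interest are MD5 and SHA-1 (the two that Maven repositories publish next to artifacts as .md5 and .sha1 files). The function name is made up for illustration and is not part of Archiva.

```python
import hashlib


def artifact_checksums(path, chunk_size=64 * 1024):
    """Return the MD5 and SHA-1 hex digests of the file at `path`.

    Hypothetical helper for illustration only; it is not an Archiva API.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}
```

Calling artifact_checksums() with the path of an unknown jar returns both digests as hex strings, either of which can then be used as the search input.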
Artifactory
According to its website, Artifactory is a Maven 2 enterprise repository which offers advanced proxying, caching and security facilities to provide for a robust, reproducible and independent build environment using Maven 2. The version I used for this comparison is 1.2.2. Below are the features of Artifactory:
- Easy to run. Artifactory includes an embedded Jetty server in its binaries, so the user can opt to use the embedded server instead of deploying it to a different application server. The only thing I can see that may be a hassle is that you still need to set the ARTIFACTORY_HOME environment variable. If you don’t set it, there’ll be problems starting up the server and you won’t even know what is wrong unless you read the documentation.
- Repository configuration. Currently, there is no user interface for configuration. To add a repository, you need to add a <localRepository> entry to the artifactory.config.xml file itself and restart Artifactory. There is a feature for importing a repository from the local file system into a local repository in Artifactory. However, with no UI for the configuration, you need access to the configuration file to know whether the target Artifactory repository supports snapshots only, releases only, or both. For example, I tried importing a repository containing both snapshots and released artifacts into the default Artifactory repository ‘libs-releases’ and could not get the import to complete; only by looking at the config file did I find out that snapshots are not handled in that repo. A plus for the import mechanism is that it validates the repository paths, so you cannot import a repository which has a problem.
- Proxy and cache. The proxying was easy to configure and easy to use. Just set the repositories you want to use in your settings.xml file and it would proxy the remote repositories you’ve set in Artifactory.
- Search and repository browse. Good separation of repositories in the browse. Fast too. The artifacts can be downloaded and removed from the repo via the Repository Browse.
- User interface. Artifactory leverages AJAX for its user interface, resulting in faster loading of pages. It also has a nice look and feel in comparison with the other repository managers.
- Deploying artifacts. Artifacts can be deployed in an Artifactory repository via the UI.
- Source code. The codebase is small, but not very well-structured. I tried building the source and I couldn’t build it because of some dependencies not being found (tried the profiles but still couldn’t build it). Below is its source structure:
- core - contains all the core classes such as the proxying, caching, security, scheduling, configuration, search, utility classes, etc.
- site
- standalone - the standalone application module (where binary files are assembled)
- wagon - contains only a class (specific implementation of HttpWagon)
- webapp - the web app component
DSMP (Dead Simple Maven Proxy)
DSMP is simply a proxy. According to its site, it can be used as a repository server, but its main purpose is to act as a filter when Maven accesses the internet. The version I used here is 1.1.
- Easy to run. A shell script ‘dsmp.sh’ is included in the bundle. But in order to execute the script, you need to update it and set the DSMP_HOME variable to the DSMP installation.
- Proxy. Just redirects requests to a proxy repository (thus the name). DSMP does not support hosting a proxy repository. Redirection of proxies can be set through this configuration:
<redirect from="[PROXIED REPO URL]" to="[PROXY REPO URL]"/>
For example:
<redirect from="http://m2.safehaus.org/org/aopalliance" to="http://maven.sateh.com/maven2/aopalliance" />
- Configuration. The configuration is simple — configuration related to proxies only. In addition to the proxy configuration mentioned above, there is also the concept of "allow" and "deny". Downloading of specific artifacts can be controlled by specifying the URLs or paths which can or cannot be downloaded from a repository. Network proxies can also be set in the configuration if direct connection is not allowed.
- Cache and patches. Caches everything downloaded as well as all failed downloads. DSMP makes use of the concept of patches. It can be configured to look into a specific directory for patches and when it sees a patch for the artifact then it would use that instead of the cached one. Failed downloads are kept in status files, so when DSMP sees these files it would no longer attempt to download the artifact.
- Source code. Very small codebase, a single-module project.
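The redirect rules shown above can be pictured as a prefix rewrite on the requested URL. Below is a rough Python sketch of that idea, not DSMP’s actual code: it assumes a rule matches when the request URL starts with the `from` value and that the first matching rule wins. The `<dsmp>` wrapper element and both function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The <redirect> element is copied from the example above; the <dsmp>
# wrapper element is an assumption made so the snippet parses on its own.
CONFIG = """
<dsmp>
  <redirect from="http://m2.safehaus.org/org/aopalliance"
            to="http://maven.sateh.com/maven2/aopalliance" />
</dsmp>
"""


def load_redirects(xml_text):
    """Collect (from, to) pairs from every <redirect> element."""
    root = ET.fromstring(xml_text)
    return [(rule.get("from"), rule.get("to")) for rule in root.iter("redirect")]


def apply_redirects(url, redirects):
    """Rewrite `url` using the first rule whose `from` prefix matches.

    Assumed semantics: plain prefix replacement, first match wins.
    URLs that match no rule are returned unchanged.
    """
    for source, target in redirects:
        if url.startswith(source):
            return target + url[len(source):]
    return url


REDIRECTS = load_redirects(CONFIG)
```

With the rule above, a request for a jar under http://m2.safehaus.org/org/aopalliance would be fetched from the matching path under http://maven.sateh.com/maven2/aopalliance instead, while requests to any other host pass through untouched.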
Proximity
Proximity is a Maven proxy for company-wide use. From its website, it is described as basically a "fetch-and-cache engine with various extra capabilities like indexing". The version I’ve used here is 1.0.0-RC9. Below are the features of Proximity:
- Easy to run. Deploying and running the war bundle was easy to do, but the jetty bundle was a little different. I assumed that all that was needed to run Proximity was to unpack the bundle and start it up by executing the included shell script ‘jetty.sh’ (as specified in the website docs). It would be better if the deployment documentation stated that JETTY_HOME needs to be set before starting it up. I also encountered a problem when starting up the jetty bundle: a directory was missing ([PROXIMITY]/bin/logs), so the application couldn’t start up successfully. Manually creating the directory solves the problem.
- Proxy and cache. The proxying works nicely. What I liked about the proxy feature in Proximity is the repository grouping. Proxy repositories which have the same group id (for example, ‘public’) are grouped together. This group of repositories can be accessed via a specific URL: http://localhost:8080/proximity/repository/[GROUP ID] (so it would be http://localhost:8080/proximity/repository/public in this example). You could use this group repository URL in your settings.xml file to tell Maven to download from this group of repositories instead of having to specify each repository URL.
- Repository configuration. Configuring repositories is very complex: you need to update an xml file and a properties file just to add a new repository.
- Reports/Statistics. Proximity keeps track of the following statistics during run-time (in memory): last 10 served artifacts, where the last 10 requests came from, last 10 local hits, last 10 remote hits, last 10 deployments and where they came from.
- Search and Repository Browse. There are two ways to browse a repository: one is via Browse Artifacts and the other is via Browse Files. In the Browse Artifacts page, the artifacts are not divided by repository, but there are ‘Group’ and ‘Origin’ identifiers to indicate which repository the artifact came from. In the Browse Files page, on the other hand, the artifacts are grouped by repository. Specific details such as the directory/file size, md5 and sha1 checksums and URLs are displayed via tool tips when browsing. You can search for an artifact across all repositories, by repository or by repository group. There is also a search using Lucene-specific queries; examples of this type of query are available in Proximity’s Search page.
- User interface. The user interface needs more improvement especially the repositories page. It looks a little too crowded and confusing — there’s no separation between the hosted and proxied repositories, only a header of ‘Hosted repository’ or ‘Proxied repository’ identifies which is which.
- Deploying artifacts. Proximity has support for artifact deployment via webdav.
- Maintenance. There are two tasks that are scheduled to be executed at specific times by proximity: re-indexing and repository purge. The actual code for purging the repositories is not yet implemented though. You could see these tasks via the Maintenance page. There is also an option for running re-indexing on a specific repository only (but this is only manually triggered, not scheduled).
- Web services. Services for re-indexing and search.
- Source code. Proximity heavily uses IoC which makes it easy to configure and to plug/unplug components. Below is Proximity’s source structure:
- px-core - the core module
- px-core-it - contains integration tests for the core module
- px-core-maven - contains maven specific functionality
- px-core-scheduler - contains the task scheduler
- px-core-ws - module for web services
- px-webapp-base - contains web controller and utility classes used by the different webapp ‘flavours’
- px-webapp-base-it - contains integration tests for the webapp base module
- px-webapp-default - contains the front-end of the webapp
- px-webapp-demosite - specific webapp module for the demo site
- px-webapp-pmaster - specific webapp module for the "master" in Proximity chaining
- px-webapp-pslave - specific webapp module for the "slave" in Proximity chaining
- px-webapp-webdav - webdav extension
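The repository grouping described under ‘Proxy and cache’ above boils down to mapping many repositories onto one URL per group id. Here is a small Python sketch of that mapping, using the URL pattern quoted in that item. The repository ids in the usage note and the function name are made up for illustration, and this is not how Proximity itself is implemented.

```python
from collections import defaultdict


def group_repository_urls(repositories, base_url="http://localhost:8080/proximity"):
    """Map each group id to its group URL and member repositories.

    `repositories` is an iterable of (repository_id, group_id) pairs.
    The URL pattern <base>/repository/<group id> follows the one quoted
    in the text; everything else here is an illustrative assumption.
    """
    members = defaultdict(list)
    for repo_id, group_id in repositories:
        members[group_id].append(repo_id)
    return {
        group_id: {
            "url": f"{base_url}/repository/{group_id}",
            "members": repo_ids,
        }
        for group_id, repo_ids in members.items()
    }
```

For instance, passing [("central", "public"), ("apache-snapshots", "public"), ("inhouse", "private")] yields one entry for ‘public’ with two member repositories and one for ‘private’ with a single member, so Maven only needs to know two URLs rather than three.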
Features Matrix Comparison
| FEATURES | ARCHIVA | ARTIFACTORY | DSMP | PROXIMITY |
| --- | --- | --- | --- | --- |
| Start-up | easy to start; has a webapp version which can be easily deployed in Tomcat | easy to start, but needs ARTIFACTORY_HOME to be set in the environment | easy to start, but startup script needs to be updated in order to set DSMP_HOME | the bundled war was easy to deploy in Tomcat; the jetty bundle has a problem when starting up; JETTY_HOME still needs to be set |
| Configuration | stored in an xml file; a UI is provided for modifying the configuration; also allows manual modification of the xml file itself; easy to configure | stored in an xml file; configuration can only be updated by manually modifying the xml file; easy to configure | stored in an xml file; easy to configure | stored in an xml file; complex configuration (to add a repository, you need to edit both an xml file and a properties file) |
| Extensibility/ Orthogonality | heavily uses IoC (Plexus); distributed into components which makes it easy for a user to add or remove plugins for Archiva (an example of this are the consumers) | uses IoC (Spring); not very extensible because of source structure | can easily be used as a back-end | heavily uses IoC (Spring); pluggable though a little complicated especially the repository configuration part |
| Proxying and Cache | allows hosting of local repositories; supports proxying of remote repositories; easy configuration of proxies via proxy connectors; caches all downloaded files; has configurable download policies; checksum fix (can be enabled/disabled); merges repository metadata files | allows hosting of local repositories; supports proxying of remote repositories | does not support hosting of repositories; caches all downloaded files and failed downloads; uses the concept of patches (cache) to override defective artifacts (or checksums) in the proxied repository | allows hosting of local repositories; supports proxying of remote repositories; has repository grouping URL which can be used as the repo URL for a set of proxied repositories; merges repository metadata files |
| Repository Browse | repositories are consolidated into the Browse page, but each repository can be browsed via webdav; a lot of important information about the artifact is available; artifacts can be easily downloaded | good division of repositories; basic info on artifacts; artifacts can be easily downloaded | N/A | repositories are consolidated in the Browse Artifacts page, but are separated by repository in the Browse Files page; has tool tips that contain detailed information about the repository and the artifact itself on mouse-over; artifacts can be easily downloaded |
| Indexing/ Search | uses Lucene for indexing and search; scheduled indexing (quartz); one index file for each repository; search across repositories; has a ‘Find Artifact’ feature which uses an unknown artifact’s checksum to search for that unknown artifact in the repository | uses Lucene for indexing and search | N/A | uses Lucene for indexing and search; scheduled indexing (quartz); one index file for all repositories; different types of search: search by repository group, search by repository, search across all repositories, search by specific Lucene query |
| Reports | reports for artifacts with problems | not supported | N/A | repository statistics |
| User Interface | organized UI; needs improvement in page load time | nice UI; fast loading of pages (uses AJAX); uses tool tips | N/A | looks a little disorganized; uses tool tips |
| Repository Support | supports Maven 2 and Maven 1 repositories; intelligent Maven 1 client detection (so no need to configure anything); stores artifacts in the file system | only supports Maven 2 repositories; stores all its artifacts in a database, rather than in the file system | only supports Maven 2 repositories | supports Maven 2 and Maven 1 repositories; stores artifacts in the file system |
| Artifact Deployment | uses dav | UI support for deploying artifacts | not supported | uses dav |
| Security | uses Redback; permissions per repository | permissions per repository | no security | has security which needs to be configured via a properties file; recommended external security provider: httpd reverse proxy |
| Database | uses Derby by default, configurable using data sources | uses Derby by default (configurable) | N/A | does not use a database |
| Documentation | available docs: site, wiki; basic documentation, some are not updated | available docs: site, wiki; basic documentation; has a live demo | available docs: site; basic documentation | available docs: site, wiki; good and detailed documentation though some are not updated; has a live demo (but has been down for a couple of months) |
| Chaining Instances | supports chaining of different instances | supports chaining of different instances (treating one another as remote repositories) | supports chaining of different instances | supports chaining of different instances using Master/Slave setup (no longer maintained) |
| Repository Purge | has repository purge feature; configurable by: retention count, # of days old and if released snapshots are to be deleted | no repository purge feature | no repository purge feature | not yet implemented |
| Web Services | not supported | not supported | not supported | supported (services for re-indexing and search) |
| Complexity (Usage) | mid | mid | low | high |
Conclusion
From the matrix above, you can see the differences among these applications, and deciding which one to use really depends on your needs. But setting aside my being an Archiva developer, I think I would still go with Archiva after having tried all of them. With the exception of web services, it matches the features of all the others and has a number of features unique to itself. At the same time it remains one of the easiest to use and has recently become much more stable, so we’re really excited for the 1.0 release this month.
COMMENTS / 16 COMMENTS
gjoseph added these pithy words on Nov 10 07 at 4:09 am
Interesting post, seems to confirm my choice for Archiva (should deploy it sometime this month). I’m surprised that none of the current maven proxies features notification: using an enterprise-wide maven repository, I’d love (and I actually *need*, it’s not just fancy) to know when new artifacts are added to the repository. I think I’ve seen a jira report for Archiva, but it didn’t seem to be much high on the roadmap. wdyt ?
Deng Ching added these pithy words on Nov 12 07 at 2:04 pm
great to hear from you
yes, there is a jira for the RSS notification of new artifacts, it is currently scheduled for a future release. i think it’s a good feature to have soon and maybe we could push it to 1.x. the team has also been working on getting releases out more frequently so hopefully that won’t be a long wait
alien added these pithy words on Nov 13 07 at 1:13 pm
I am using Proximity in my environment. I just want to know a bit more on the deficits of Proximity compared to Archiva, for my long term consideration on switching our repository manager.
Does Archiva provide repository grouping? My environment setup relies heavily on repository grouping in order to reduce changes to POM and settings.xml. Apart from being troublesome to configure, what is Proximity lacking in comparison with Archiva?
Thanks
Deng Ching added these pithy words on Nov 14 07 at 11:12 am
hey, thanks for your interest
I’m sorry to say, Archiva currently doesn’t provide repository grouping. I agree with you that it is a feature which is very convenient to have. that’s also what I liked most about Proximity when I tried it out. this feature is definitely something we’ll consider adding for Archiva in the future.
now.. comparing to Archiva, one of the features that Proximity is lacking is the repository purge (though i think this is already in their road map). Archiva also has an intelligent Maven 1 client detection which I don’t think Proximity has. last but not the least is the ‘Find Artifact’ feature in Archiva — if you have an unknown artifact on-hand and you want to know the identity of that artifact, Archiva can do that for you. i think that’s basically about it..hope i was able to clearly answer your questions
James added these pithy words on Nov 16 07 at 3:15 am
I can’t seem to find anywhere where it says Archiva supports importing existing artifacts from a local file system repository. This is kind of a problem for me where I work because we have so very many 3rd party artifacts we already have deployed to our make-shift repo. Does it not support importing existing repositories from the file system?
Deng Ching added these pithy words on Nov 16 07 at 11:06 am
hey
Archiva doesn’t actually import the repository into a different directory but instead it manages that actual repo. you can do that by creating a new managed repository in Archiva — just specify the file system path of your existing repository as its location (via the ‘Directory’ field) and now your repository is an Archiva managed repository which you could setup as a proxy repo. hope that clarifies your concern
Alex Mayorga Adame added these pithy words on Nov 29 07 at 11:52 pm
First of all kudos on the release to everyone.
I’ve found the "importing" of an already existing repository on the file system (given the confusion, I’d propose to change that to "management") to be functional and very straightforward.
I’ve just been using standalone Archiva for a couple of days now and I have some suggestions.
- When browsing repositories it would be nice to have a tidbit on the artifact info page telling me where from is the artifact being pulled (currently I find myself hovering the pointer over the jar/pom icons to find that out)
- Also on the info page of the artifacts it shows the "organisation" as a number, shouldn’t it show the actual name of the org name.
I’ve talked on IRC about a "feature" that is needed in the enterprise, even before notifications:
- Approvals (in my experience the enterprise is all about approvals, tons of them). I’ve talked on IRC about the waiting room concept, when an "unapproved" artifact is requested over a proxy connector it might be recorded and then when a number of approvals are gathered (architects, legal, tech lead, etc) either by the web interface, mail or whatever means, next time it’s requested it would be actually pulled from the proxied repository.
- Popularity (so management can see what is popular among developers). Maybe even the number of requests for an artifact in the "waiting room" (see above) so they can assess that need more promptly.
I guess with the right guidance I might try to hack my suggestions myself and get to know Archiva from the inside out.
Thanks again and keep up the excellent work.
Deng Ching added these pithy words on Dec 12 07 at 9:38 am
Thanks for all your suggestions Alex! =) The new features you’ve mentioned look good. Please feel free to post them up on the dev list so everyone in the community can also give their input. See you in planet Archiva!
greg added these pithy words on Dec 12 07 at 9:45 am
How would that popularity concept work, given that the proxy would only be queried when developers don’t have a copy of the artifact in their local repos yet? It would also probably show the wrong value if certain developers tend to clean up their local repos
Anonymous added these pithy words on Jan 23 08 at 9:05 am
It’s really an understatement to say that Archiva performance leaves something to be desired. Loaded it up on a powerful test machine, hit the browse button, and I got an hourglass for the next 5 minutes! Most of the other buttons do the same. If a fresh repository has this poor performance, what will it be when I load a few thousand snapshots in there? Too scary to consider.
James added these pithy words on Jan 24 08 at 7:52 am
You’re kidding right?
Archiva’s web interface is comprised of JSPs. The first time you visit any page within Archiva’s UI, a JSP needs to be compiled. Once this is done, the pages subsequently load much faster.
We’ve been using Archiva for a couple months now (since 1.0), and with 20 to 30 developers have seen anything but a performance issue. Downloads for cached or internal artifacts are so fast, they aren’t worth benchmarking. We use an Athlon X2 2.4gz box running RedHat and share it with CVS and Subversion and there has never been an issue for any of our Maven2 developers.
The only thing that is slow, is Subversion, but that’s probably because it has a lot more to do.
-James
Anonymous added these pithy words on Mar 12 08 at 8:44 pm
It would be nice if you could also add a sample working configuration which utilizes multiple repositories. I am struggling to get either Proximity or Archiva working in my setup, which accesses around 10 different maven repos. I couldn’t find any sample configuration that I could refer to and get things working.
Kostis added these pithy words on Mar 20 08 at 10:07 am
Concerning Artifactory, version 1.2.5 has many new features not available in 1.2.2 (the one mentioned here):
* Proxies also Maven1 repositories.
* Virtual Repositories - Aggregate content view of any number of local and remote repositories combined.
* Automatic cleanup of unique snapshots.
* WebDAV support for browsing, listing, deploying and undeploying (apart from UI).
* Full system import/export (ala JIRA/Confluence) including security settings.
* Scheduled repository backups.
* Import/Export and backup to standard Maven 2 file-based repositories.
* Configuration can be dynamically reloaded in runtime from the UI. No need to restart.
Thank you for your effort compiling those facts.
dmitry added these pithy words on Apr 03 08 at 6:40 pm
do any of the repo managers offer a GUI alternative to “mvn deploy:deploy-file”?
Deng Ching added these pithy words on May 06 08 at 6:36 am
Thanks Kostis for posting the new features of Artifactory
Deng Ching added these pithy words on May 06 08 at 6:39 am
dmitry, I think only Artifactory currently has this GUI feature for deployment.
This would be a new feature in Archiva 1.1 though. It’s actually available in svn trunk now, it just hasn’t been released yet