Automating Web Site Evaluation


Analysis Tool

We have temporarily disabled the Analysis Tool. Please try again in a few days. We apologize for this inconvenience.

Analysis for One Page

  Content Category Assessments
  Select one or more content categories for comparison:
  Community, Education, Finance, Health, Living, Services

Analysis for a Site Crawler Tool Archive

*Archive URL:
  (directory containing the metrics.input.data and
   scent.input.sorted.data files generated by the
   Site Crawler Tool; e.g., http://www.mysite.com/archive/)
Content Category Assessments
   Select this option if you want to generate quality predictions
   based on the six content categories: Community, Education,
   Finance, Health, Living, and Services. The directory specified
   above must contain a file named content.input.data with the
   following format.


   Categoryx is one of the six content categories. All pages in a
   site are compared to the specified content categories. The
   Site_Id is the first field in the metrics.input.data file.
Please address all questions and comments to the WebTANGO Team (tango@sims.berkeley.edu).


This tool computes 157 quantitative page-level and site-level measures as well as page and site quality predictions based on our profiles of highly-rated Web interfaces. The measures deal with many aspects of Web interfaces, including text and link elements, graphic and page formatting, page performance, and site consistency. Currently, the tool supports only English-language Web pages. The following links provide more details about the measures.

Interactive Appendix: PowerPoint (4.7MB), HTML

Summary of Measures: Word (255K), HTML

The tool can be configured to compute measures and quality predictions for one page or for an entire archive created by the Site Crawler Tool. In the latter case, you need to specify the top-level directory that contains the metrics.input.data and scent.input.sorted.data files. Please make sure that both of these files reflect the actual location of the downloaded pages on your server.

The tool currently does not support pages built with framesets, Flash, applets, or similar technologies. It also ignores scripts.

When the tool finishes computing the requested data, the data is archived in a gzipped tar file and an email notification is sent to the specified email address. This message contains instructions for downloading and unpacking the archive.
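As a rough illustration of the unpacking step, the sketch below extracts a gzipped tar file using Python's standard tarfile module. The function name, the file name results.tar.gz, and the destination directory are hypothetical; the instructions in the notification email take precedence over anything shown here.

```python
import tarfile
from pathlib import Path


def unpack_results(archive_path: str, dest_dir: str) -> list[str]:
    """Extract a gzipped tar file into dest_dir and return its member names.

    Hypothetical helper: the real archive name and layout come from the
    notification email, not from this sketch.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        members = tar.getmembers()
        # Basic safety check: refuse members whose path would land
        # outside the destination directory.
        for member in members:
            target = (dest / member.name).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
        return [member.name for member in members]
```

A call such as unpack_results("results.tar.gz", "results") would place the archived files under a local results directory and return their names.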

Analysis for One Page

Specify the URL of the page for which you wish to compute measures and generate quality predictions. By default, quality predictions are generated using the overall page and site quality models, the good page cluster models, and the page type models as described on the Profiles page. You may also request quality predictions using one or more content category models.


Please specify an email address for sending a link to the requested data.

Page Type

You can specify a page type to use with the page type quality model. By default, the predicted page type is used with this model. The following page types are supported.

  • Home - main entry pages to a site that typically provide a broad overview of site contents.
  • Link - pages that mainly provide one or more lists of links. Links may be annotated with text or grouped with headings (e.g., a Yahoo directory, redirect page, or sitemap). This functional type includes category pages (entry pages to sub-sites or major content areas).
  • Content - pages that mainly provide text. This functional type includes reference (e.g., a glossary, FAQ, search and site tips, and acronyms) and legal (e.g., disclaimers, privacy statements, terms, policies, and copyright notices) pages.
  • Form - pages that are primarily HTML forms.
  • Other - all remaining graphical (e.g., splash pages, image maps, and Flash) and non-graphical (e.g., blank, under construction, error, applets, text-based forms, and redirect) pages.

Analysis for a Site Crawler Tool Archive

Specify the URL of the top-level directory of the archive generated by the Site Crawler Tool. The metrics.input.data and scent.input.sorted.data files in that directory will be used to compute measures and generate quality predictions. Please make sure that these input files are publicly readable.
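Before submitting an archive URL, it may help to confirm that both input files can be fetched without credentials. The following is a minimal sketch using only Python's standard library. The helper names are hypothetical, and it assumes that an HTTP 200 response to a HEAD request is a reasonable proxy for "publicly readable"; some servers do not answer HEAD requests, in which case a False result proves little.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# File names taken from the tool description above.
REQUIRED_FILES = ("metrics.input.data", "scent.input.sorted.data")


def input_file_urls(archive_url: str) -> list[str]:
    """Build the URLs of the two required input files under archive_url."""
    # urljoin drops the last path segment unless the base ends with "/".
    base = archive_url if archive_url.endswith("/") else archive_url + "/"
    return [urljoin(base, name) for name in REQUIRED_FILES]


def is_publicly_readable(url: str, timeout: float = 10.0) -> bool:
    """Return True if an unauthenticated HEAD request to url yields HTTP 200."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers strings that are not valid URLs.
        return False
```

For the example directory http://www.mysite.com/archive/, input_file_urls returns the URLs of both files, and each one can then be passed to is_publicly_readable.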

As with a single page, you can request quality predictions using one or more content category models. To do so, create the file content.input.data in the same directory as the metrics.input.data file on your server. This file associates content categories (Community, Education, Finance, Health, Living, and Services) with each Site_Id (the first field in the metrics.input.data file).
