Getting started with Google Pinpoint: A guide for newsrooms

Pinpoint is a free document analysis tool created by Google as part of its Journalist Studio initiative. Designed specifically for investigative journalists, Pinpoint helps users analyze large sets of documents quickly and efficiently by identifying names, organizations, locations and other entities across various formats, including PDFs, scanned documents, images, email archives, Word files, slideshows and audio files. It functions as a search engine layered over your documents, allowing you to find crucial information.

The tool’s integration with Google’s knowledge graph and automatic tagging capabilities can wrangle overwhelming FOIA dumps into searchable, organized archives. For cash-strapped newsrooms dealing with substantial document troves, this tool is an affordable and effective option.

Here’s how to get up and running quickly.

Step 1: Apply for access and set up

Pinpoint is free for verified journalists and academics. Apply through Google’s Pinpoint website with your professional credentials. Most users receive access within a few days.

Your account includes:

  • Up to 100,000 documents per collection
  • Unlimited collections
  • OCR for scanned documents and images
  • Audio transcription
  • Collaborative sharing features with unlimited users

Step 2: Create your first collection

Start by creating a new collection to organize your documents. Structure these by topic, investigation, beat or reporter to ensure long-term usability as your archive grows.

Best practices for organization:

  1. Use a consistent naming convention
  2. Consider both current and future search needs
  3. Plan for team collaboration from the start
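
One way to keep names consistent is to generate them rather than type them. The sketch below is only an illustration: the date_beat_topic pattern and the collection_name helper are assumptions made for this example, not anything Pinpoint requires.

```python
import re
from datetime import date


def collection_name(beat: str, topic: str, started: date) -> str:
    """Build a name in the (assumed) pattern YYYY-MM_beat_topic."""

    def slug(text: str) -> str:
        # Lowercase, turn runs of non-alphanumerics into one hyphen, trim hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{started:%Y-%m}_{slug(beat)}_{slug(topic)}"


print(collection_name("Courts", "County Jail Contracts", date(2024, 3, 5)))
# prints: 2024-03_courts_county-jail-contracts
```

Leading with the date keeps collections in chronological order when sorted by name, and the beat makes it easy for colleagues to scan for their own work.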

Step 3: Upload your files

Drag and drop files directly into your collection. Pinpoint supports PDFs, Word documents, audio recordings, scanned images and more.
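
Before dragging a large folder into Pinpoint, it can help to take stock of what is in it. The following is a minimal sketch that runs on your own computer and does not talk to Pinpoint. The extension list is an assumption for illustration (check Pinpoint's own documentation for the formats it accepts), and the 100,000 figure is the per-collection limit quoted earlier in this guide.

```python
from collections import Counter
from pathlib import Path

# Assumed extension list for illustration only, not an official Pinpoint list.
ASSUMED_SUPPORTED = {".pdf", ".docx", ".jpg", ".png", ".mp3", ".wav"}
MAX_DOCS_PER_COLLECTION = 100_000  # limit quoted in this guide


def inventory(folder: str) -> tuple[Counter, list[Path]]:
    """Count files by extension and list those outside the assumed set."""
    counts: Counter = Counter()
    unsupported: list[Path] = []
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        counts[ext] += 1
        if ext not in ASSUMED_SUPPORTED:
            unsupported.append(path)
    return counts, unsupported


def fits_in_one_collection(counts: Counter) -> bool:
    """True if the total file count is within the per-collection limit."""
    return sum(counts.values()) <= MAX_DOCS_PER_COLLECTION
```

Calling inventory on a folder (for example, a hypothetical "foia_dump" directory) returns the per-extension counts plus a list of files worth checking by hand before upload.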

Step 4: Use automatic tagging and search

Pinpoint automatically identifies people, locations, organizations and dates across uploaded documents. Use these entity filters to quickly narrow findings without manual sorting.

The Google knowledge graph enables sophisticated searches — searching for “JFK” surfaces references to John F. Kennedy.

Step 5: Set up collaboration

Share your collections with editors, researchers or collaborators. Team members can make notes and annotate documents together, and you can assign different access levels based on each person's role in the investigation.

Questions to consider

How large is your document set? Pinpoint excels with large volumes of data such as FOIA dumps, court filings, and email records. The tool provides the most value when dealing with substantial document troves that would be impractical to review manually.

What are your team’s ongoing needs? Consider whether you need document analysis for one-off investigations or as an internal newsroom archive. Pinpoint’s flexibility supports multiple use cases — some newsrooms upload all interviews to search for relevant quotes later, while others digitize handwritten notes to make them searchable and accessible.

How should collections be organized? Establish a consistent organizational structure early. Organizing collections by topic, beat or investigation type will make your archive more useful over time and easier for team members to navigate.

Do you have policies in place for handling confidential materials? Ensure any sensitive documents comply with your privacy standards before uploading to a Google-hosted platform. While Google states it doesn’t train on Pinpoint data and maintains Gmail-level security, avoid uploading documents you wouldn’t be comfortable sending via email.

Is Pinpoint right for your newsroom?

Choose Pinpoint if you:

  • Handle large document collections (FOIA dumps, court filings, email archives)
  • Need powerful OCR (Optical Character Recognition) for scanned or handwritten materials
  • Want free document analysis with Google-level search capabilities
  • Require collaborative features for team investigations
  • Need audio transcription integrated with document analysis

Consider alternatives if you need:

  • Advanced data visualization tools
  • Integration with newsroom content management systems
  • Guaranteed data sovereignty or on-premises hosting
  • Advanced redaction or annotation features
  • Structured data extraction from complex document formats

Pro tips

Transform unstructured data: Use Pinpoint’s “golden document” feature (a single master document that serves as an annotation template for all of the documents in the same collection) to extract specific data points from similarly formatted documents, such as tax filings or police reports. The extracted data can then be formatted as a CSV file without coding skills.
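
Once extracted data is in CSV form, a short script can flag rows that need a human look. This is a hedged sketch, not a description of Pinpoint's actual output: the column names and sample rows below are invented for illustration, and a real file will have whatever fields you annotated.

```python
import csv
import io

# Hypothetical CSV: column names and values are invented for illustration.
SAMPLE_EXPORT = """document,report_date,officer,incident_type
report_001.pdf,2023-01-14,J. Smith,Traffic stop
report_002.pdf,,A. Lee,Traffic stop
report_003.pdf,2023-02-02,,Use of force
"""


def rows_needing_review(csv_text: str) -> list[dict]:
    """Return rows where any extracted field came back empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if any(not (value or "").strip() for value in row.values())
    ]


for row in rows_needing_review(SAMPLE_EXPORT):
    print(row["document"])
# prints report_002.pdf, then report_003.pdf
```

Empty fields often mean a document did not match the template closely enough, so those are the ones to open and check against the original.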

Explore Generative AI Features (Beta): Pinpoint’s new AI capabilities can answer questions using only your uploaded documents as context, providing citations to source material. The executive summary feature and suggested questions can help guide early-stage document review. However, these tools work best as research assistants rather than as a source of final analysis, because they still make significant factual errors. Journalists may be better served at this stage by using NotebookLM or other generative AI tools.

Use Batch Downloading Tools: Browser extensions that download files in bulk can help you quickly build Pinpoint collections from websites containing multiple relevant documents.
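
If a browser extension is not an option, the same idea can be sketched in a few lines of Python using only the standard library. The example below only finds document links in a page's HTML; downloading them is left out. The extension list, the DocLinkParser and find_document_links names, and the example.com address are all assumptions made for illustration. Check a site's terms of use before collecting files in bulk.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

DOC_EXTENSIONS = (".pdf", ".docx", ".doc")  # assumed set; adjust as needed


class DocLinkParser(HTMLParser):
    """Collect absolute URLs of links that point at document files."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(DOC_EXTENSIONS):
            self.links.append(urljoin(self.base_url, href))


def find_document_links(html: str, base_url: str) -> list[str]:
    """Return document URLs found in the given HTML, resolved against base_url."""
    parser = DocLinkParser(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/minutes/2024-01.pdf">January</a> <a href="/about">About</a>'
print(find_document_links(page, "https://example.com/meetings/"))
# prints: ['https://example.com/minutes/2024-01.pdf']
```

The resulting list can be saved to a file and reviewed before anything is downloaded and added to a collection.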

Leverage Public Collections: Explore document collections shared by other newsrooms worldwide. Major outlets regularly share document collections that can serve as a jumping-off point for local stories. In 2020, Big Local News shared over 4,000 documents related to local government meeting agendas and minutes.

Streamline Fact-Checking Processes: Some newsrooms have integrated Pinpoint into their fact-checking workflows. Jim Malewitz at Wisconsin Watch explains that some reporters create Pinpoint collections for each story’s fact-checking materials. This allows editors to link directly to specific passages within documents, rather than providing page numbers or relying on search functions, which significantly speeds up the verification process.

Alternatives

DocumentCloud:

  • Free service from MuckRock Foundation with more newsroom-specific features
  • Better annotation and collaboration tools
  • Self-hosted options available

OpenRefine:

  • Free, open-source data cleaning and transformation tool
  • Excellent for structured data extraction from documents
  • Requires more technical skill but offers greater control

Datasette:

  • Free tool for exploring and publishing datasets
  • Great for making document collections searchable and shareable
  • Requires some technical setup but highly customizable

Datashare:

  • Free, open-source tool built by ICIJ for investigative journalism
  • Self-hosted option keeps sensitive documents under your control
  • Advanced graph visualization capabilities for connecting entities
  • Collaborative features designed for team investigations

Written by Z. Waite

Z. Waite is a journalist, researcher, and current graduate student at the UC Berkeley School of Journalism, where they report on artificial intelligence and study the impact of new technologies on the news industry.