An overview of the news API platform

The ZENARK API

Access real-time and historical news from thousands of sources using our news API.

The ZENARK API makes it easy to integrate news content into your application or project. Query our near real-time database of news, press releases and articles from a variety of verified sources. The API is feature rich yet easy to use, so you can write precise queries that return relevant content.

Our customers are building a wide range of applications including brand / reputation monitoring, AI model training, mobile news applications and trend forecasting. They all have a common requirement for a reliable, robust and trusted news delivery platform.

A Trusted Platform

Powerful API

A simple but flexible API based on open standards

Easily integrated with both legacy and modern systems

Scalable cloud-based architecture

Curated sources

Hand-curated content, carefully selected and monitored for quality and consistency

Not just news: we include government publications, press releases and blogs

Smart processing

Intelligent scheduling ensures almost real-time discovery of news stories

Content filtered for duplicates, spam and adverts

Boilerplate text removed to prevent false positives

Reliable

Not dependent on publisher feeds such as RSS or News Sitemaps

No missing articles

Based on scalable and resilient cloud solutions

Architecture

Our technology has evolved over the last 20 years into a robust and trusted platform for discovering, indexing and searching news content. Thousands of users rely on it every day.

Challenges

Indexing news stories poses different challenges from traditional search: content must be discovered and indexed as close to real time as possible.

Accuracy is a key challenge. The platform must identify and retrieve items promptly without burdening the resources of other web properties. Once an item is retrieved, it must be reduced to plain text by removing boilerplate templates. Meta information such as the title, publication date and article text must be extracted precisely from a wide range of HTML sources. Not all HTML is equal: the structure and standards used in pages vary greatly.
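As an illustration of the extraction problem only, the sketch below pulls a title and a publication date out of a small HTML page using Python's standard html.parser module. It is not the ZENARK implementation: the class and function names are hypothetical, and it assumes the page exposes its date in an Open Graph style article:published_time meta tag, which many real pages do not.

```python
from html.parser import HTMLParser


class ArticleMetaParser(HTMLParser):
    """Collects the <title> text and a publication-date <meta> tag, if present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.published = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("property") == "article:published_time":
            # Assumption: the date is in an Open Graph style tag.
            self.published = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_meta(html):
    """Return the title and publication date found in an HTML string."""
    parser = ArticleMetaParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "published": parser.published}


sample = """<html><head>
<title> Example headline </title>
<meta property="article:published_time" content="2020-01-16T09:00:00Z">
</head><body><p>Article text.</p></body></html>"""

print(extract_meta(sample))
```

A page that lacks the meta tag yields a published value of None, which is one small example of why extraction rules must cope with varied page structure.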

Our content is selected and regularly checked by hand to ensure consistently high-quality search results

We use predictive algorithms to schedule our crawlers, which results in near real-time detection and retrieval

Spam and duplicates are automatically removed as part of the verification stage

Each page is parsed to accurately extract the date, title and article text, and to remove boilerplate text

Additional information such as region and language is extracted and stored as metadata

The parsed document and its associated metadata are indexed and made available via our API
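To make the duplicate-removal step concrete, here is a minimal sketch that drops exact duplicates by hashing normalized article text. This is a simple stand-in technique chosen for illustration, not a description of how the ZENARK verification stage works; the function names and the "ref" and "text" keys are assumptions, and near-duplicate detection would need a fuzzier method.

```python
import hashlib
import re


def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drop_duplicates(articles):
    """Keep the first article seen for each fingerprint, preserving order."""
    seen = set()
    unique = []
    for article in articles:
        key = fingerprint(article["text"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


articles = [
    {"ref": "a1", "text": "Markets rallied on Tuesday."},
    {"ref": "a2", "text": "markets  rallied on\nTuesday."},
    {"ref": "a3", "text": "A different story."},
]

# a2 differs from a1 only in case and whitespace, so it is dropped.
print([a["ref"] for a in drop_duplicates(articles)])
```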

HTTP request and JSON response

The ZENARK service uses a standards-based scheme of HTTP requests with JSON responses that is familiar to developers of API-based systems

No additional developer tools or skills required
Requests can be submitted at the command line
Additional query parameters allow result filtering

							
{
    "ref": "ZZyRdGw5",
    "title": "China believes a mysterious pneumonia outbreak is caused by a new strain of virus from the same family as SARS",
    "source": {
        "name": "RTE.ie",
        "region": "IE"
    },
    "published": "2020-01-16",
    "lang": "en",
    "keywords": ["SARS", "covid-19", "Xu Jianguo", "World Health Organization"]
}
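The following sketch shows how a client might build a filtered request URL and read a response of this shape. The endpoint (api.example.com) and the parameter names q and lang are placeholders invented for illustration, not documented ZENARK values; the field names come from the sample response above. No network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/search"


def build_search_url(query, **filters):
    """Build a GET URL from a query string plus optional filter parameters."""
    params = {"q": query}  # "q" is an assumed parameter name
    params.update(filters)
    return BASE_URL + "?" + urlencode(params)


def summarize(article):
    """One-line summary built from the fields shown in the sample response."""
    source = article.get("source", {})
    return "{} [{}] {}: {}".format(
        article.get("published", "?"),
        source.get("name", "?"),
        article.get("lang", "?"),
        article.get("title", ""),
    )


# A response body shaped like the sample, with a shortened title.
SAMPLE_RESPONSE = """
{
    "ref": "ZZyRdGw5",
    "title": "Example headline",
    "source": {"name": "RTE.ie", "region": "IE"},
    "published": "2020-01-16",
    "lang": "en",
    "keywords": ["SARS", "World Health Organization"]
}
"""

print(build_search_url("SARS outbreak", lang="en"))
print(summarize(json.loads(SAMPLE_RESPONSE)))
```

Because the response is plain JSON, any standard JSON library can decode it; the same URL could equally be requested from the command line.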

Features at a glance

Real-time indexing

Rapid discovery and indexing of articles

Boilerplate removal

Only the article text is indexed

Handpicked content

We carefully choose sources for the index

JSON format response

Use your favorite library to integrate our content

User friendly search

We use a familiar syntax for constructing queries

20 years' experience

Our people and technology are not the new kids in town

Reliable

The technology is incredibly reliable and well proven

Duplicate detection

Duplicates and other anomalous articles are automatically handled

AI metadata

Supplemental information is added using AI processing

Simple API

Our API can be accessed at the command line

Multiple languages

We support and detect content in multiple languages

Variety of content

It's not just news: we index government publications, blogs and social content
