Searching recent news using Tavily API

Hi Tavily team,

We’re using your API for real-time financial and macroeconomic sentiment analysis. Our setup applies several filters:

  • search_depth="advanced"
  • topic="news"
  • include_raw_content=True
  • max_age_days=14
  • Strict filtering by trusted_domains (e.g., cnn.com, reuters.com, eia.gov, etc.)

Despite these, we’re seeing cases like:

https://www.cnn.com/2024/05/28/climate/energy-grid-modernization-biden

This article is:

  • Outside our max_age_days (May 2024)
  • From a non-relevant section (/climate/), even though we’re targeting oil/energy market narratives
  • Not recent or directly relevant to our query (e.g., “EIA crude oil forecast”)

So my questions are:

  1. Does max_age_days guarantee all results are within that range, or is it treated as a soft filter?
  2. Is there any way to restrict search results to specific sections within a domain (e.g., only /business/ or /markets/ from cnn.com)?
  3. Are there plans to expose more structured metadata like article categories/tags or better date validation on your end?

We’re already post-filtering results using published_date and regex path rules, but we’d like to better understand what to expect from the API response upstream.
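A minimal sketch of this kind of post-filter, assuming each result is a dict with url and published_date keys and that the date arrives as an RFC 2822 or ISO 8601 string. The path rules and the names post_filter and parse_published are illustrative, not part of the Tavily API:

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlparse

# Hypothetical path rules; adjust to the sections you care about.
ALLOWED_PATHS = [re.compile(r"^/business/"), re.compile(r"^/markets/")]


def parse_published(value):
    """Parse an RFC 2822 or ISO 8601 date string; return None if neither fits."""
    if not value:
        return None
    for parser in (parsedate_to_datetime, datetime.fromisoformat):
        try:
            parsed = parser(value)
        except (TypeError, ValueError):
            continue
        # Treat naive timestamps as UTC so comparisons never mix aware/naive.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed
    return None


def post_filter(results, max_age_days, now=None):
    """Keep results that are recent enough and whose URL path matches a rule."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    for item in results:
        published = parse_published(item.get("published_date"))
        path = urlparse(item.get("url", "")).path
        if published is None or published < cutoff:
            continue  # drop undated or stale items
        if not any(rule.search(path) for rule in ALLOWED_PATHS):
            continue  # drop items outside the allowed sections
        kept.append(item)
    return kept
```

With these rules, the CNN /climate/ URL above would be dropped on the path check alone, regardless of its date.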

Thanks!

Hi there,

Thanks for the detailed message and for using Tavily for your financial and macroeconomic sentiment analysis.

A few clarifications:

  1. It looks like you may be using the wrong parameters. The correct date parameter is days, not max_age_days.
    days: This parameter limits search results to content published within the last X days. Also, to filter results by specific domains, please use include_domains instead of trusted_domains. This ensures that only results from the domains you specify (e.g., cnn.com, reuters.com, eia.gov) are returned.
    More on this here: Search API Reference

  2. Currently, the Search API does not support filtering by URL paths (e.g., /business/ or /markets/) directly.
    However, you can post-filter search results to identify relevant URLs and then use either our Extract or Crawl endpoint to retrieve raw content from those URLs. In particular, the Crawl endpoint supports a select_paths parameter:

select_paths: ["/business/.*", "/markets/.*"]

This allows you to define regex patterns to extract content only from specific paths.
Details here:

  3. At the moment, we don’t expose structured metadata like categories/tags or additional date validation. That said, we appreciate the feedback and I’ll pass it along to our team for consideration.
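As a rough sketch of the corrected request from point 1, the parameters could be assembled like this. The helper name build_search_kwargs is hypothetical, and the commented usage assumes the tavily-python client; please check the Search API Reference for the authoritative parameter list:

```python
def build_search_kwargs(query, days=14, domains=None):
    """Assemble search keyword arguments using the parameter names from this reply."""
    return {
        "query": query,
        "search_depth": "advanced",
        "topic": "news",
        "days": days,  # replaces max_age_days
        "include_domains": list(domains or []),  # replaces trusted_domains
        "include_raw_content": True,
    }


# Usage, assuming the tavily-python client and a valid API key:
# from tavily import TavilyClient
# client = TavilyClient(api_key="YOUR_API_KEY")
# response = client.search(
#     **build_search_kwargs("EIA crude oil forecast", domains=["reuters.com", "eia.gov"])
# )
```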

Let us know if you have more questions or need help setting up a custom filtering workflow!

Best,
May