Q: In terms of searching and analysis, can Comet access and analyze more information than Perplexity, e.g., by opening more links? For example, would Comet be better able than Pxy to answer this query: "Scan this blog, https://dispol.blogspot.com/, for posts that link to Pxy searches and briefly describe the context for the link, and also state what % of blog posts there are by the author Germaine compared to other authors like PD and Susan."? I'm interested in a general capability assessment.
A: Comet, Perplexity's AI-powered browser, provides significantly more comprehensive access to information than traditional Perplexity searches. The key advantage lies in its Personal Search feature, which leverages browsing history and AI-driven web navigation to retrieve and analyze information on your behalf. This includes:
- Direct tab and page analysis: Comet can analyze content across multiple open tabs simultaneously and maintain context between them
- Authorization-based access: Unlike regular web searches, Comet can access pages that require login credentials or personal authorization, such as social networks, email accounts, and private dashboards
- Automated navigation: The browser can navigate through multi-page websites, follow internal links, and perform complex browsing workflows autonomously
Traditional Perplexity Limitations:
Regular Perplexity searches are limited to publicly accessible web content and cannot maintain persistent browsing sessions or access authenticated content. While Perplexity excels at synthesizing information from multiple sources, it cannot perform the kind of deep, multi-page website analysis that your example query requires.
Answering Your Specific Query Example
For your example query about analyzing the blog at https://dispol.blogspot.com for Perplexity links and author percentages, Comet would indeed be significantly more capable.
Comet's Advantages:
- Systematically browse through blog archives and individual posts
- Maintain context about link patterns and author information across multiple pages
- Access and analyze the full blog structure, including navigation between different time periods and categories
- Perform comprehensive content analysis across the entire site
Current Perplexity Limitations:
Based on the search results, regular Perplexity can only analyze the immediate content available on a single page load. For the sample blog provided, this would mean limited analysis of only the most recent posts visible on the homepage, missing the comprehensive site-wide analysis your query requires.
A single search_web call can issue at most 3 keyword queries and returns up to 10 results per query.
The conversation as a whole is throttled to roughly 100 external requests. For large blogs (5 000+ posts) that is far short of an exhaustive crawl.
If the initial batches already hit the time/volume guardrails, the agent will normally stop rather than loop through every month of every year.
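To make the scale of the shortfall concrete, here is a rough back-of-the-envelope sketch in Python. The three limits are the figures quoted above; the helper names and the `posts_per_request` parameter (how many posts one fetched page exposes) are assumptions introduced only for illustration.

```python
import math

QUERIES_PER_CALL = 3     # from the text: at most 3 keyword queries per search_web call
RESULTS_PER_QUERY = 10   # from the text: up to 10 results per query
REQUEST_BUDGET = 100     # from the text: roughly 100 external requests per conversation

# Upper bound on search results returned by one call.
RESULTS_PER_CALL = QUERIES_PER_CALL * RESULTS_PER_QUERY


def requests_needed(total_posts, posts_per_request):
    """Page fetches needed to see every post once (hypothetical helper)."""
    return math.ceil(total_posts / posts_per_request)


def coverage(total_posts, posts_per_request, budget=REQUEST_BUDGET):
    """Fraction of posts reachable within the request budget, capped at 1.0."""
    reachable = min(total_posts, posts_per_request * budget)
    return reachable / total_posts


# A 5,000-post blog where every post is opened individually.
print(RESULTS_PER_CALL)
print(requests_needed(5000, 1))
print(f"{coverage(5000, 1):.0%} of posts reachable")
```

Under these assumptions, opening posts one at a time reaches only a small fraction of a 5,000-post blog, which is why archive pages or feeds that expose many posts per request matter so much.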
Heuristics that favour frequent authors
Comet looks first for the author label that Blogger places under the title (“Posted by …”) on archive pages. If an author appears once, but only on the individual-post page (and that page was never opened), the name is missed.
Once successive pages yield only author names that have already been seen, the agent stops opening new pages, an optimisation that again hurts single-appearance writers.
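The effect of this kind of early stopping can be sketched in a few lines of Python. This is not Comet's actual logic, which is not documented here; the `collect_authors` function and its `patience` threshold are hypothetical, and the page contents are made-up sample data using the author names from the question.

```python
def collect_authors(pages, patience=2):
    """Scan pages in order and stop early once `patience` consecutive
    pages contribute no new author name (hypothetical heuristic)."""
    seen = set()
    stale = 0    # consecutive pages that added nothing new
    opened = 0   # pages actually read
    for authors_on_page in pages:
        opened += 1
        new_names = set(authors_on_page) - seen
        if new_names:
            seen |= new_names
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return seen, opened


# Made-up archive pages: Susan appears only on the last one.
pages = [
    ["Germaine", "PD"],
    ["Germaine"],
    ["Germaine", "PD"],
    ["Susan"],
]

authors, opened = collect_authors(pages)
print(sorted(authors), opened)
```

With the default threshold the scan gives up before the fourth page, so the single-post author is never recorded; raising `patience` finds her, at the cost of more requests.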
Practical constraints
Large HTML pages are clipped at 300 kB, so very long pages may be truncated before the “Posted by” line.
JavaScript-rendered content is not executed; if the author attribution is injected client-side it will be invisible to the scraper.
Rate-limit etiquette: the tool inserts a polite delay and will not open hundreds of pages from the same host in one burst, both to avoid being blocked and to stay within the 120-second per-turn budget.
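The truncation constraint is easy to illustrate. The sketch below assumes that "300 kB" means 300,000 bytes and that clipping simply cuts the byte stream; both are assumptions, and `clip` and `has_author_line` are hypothetical helpers rather than part of any real tool.

```python
CLIP_BYTES = 300 * 1000  # "300 kB" from the text; 1000 vs 1024 bytes per kB is an assumption


def clip(html, limit=CLIP_BYTES):
    """Cut a page to at most `limit` bytes of UTF-8 (hypothetical clipping)."""
    raw = html.encode("utf-8")[:limit]
    # Drop a partial multi-byte character left at the cut point.
    return raw.decode("utf-8", errors="ignore")


def has_author_line(html):
    """True if Blogger's "Posted by" attribution is present in the text."""
    return "Posted by" in html


# Made-up page: 400,000 bytes of body text followed by the attribution.
long_page = "<p>" + "x" * 400_000 + "</p><span>Posted by Susan</span>"

print(has_author_line(long_page))        # attribution present in the full page
print(has_author_line(clip(long_page)))  # attribution lost after clipping
```

Any attribution that sits beyond the cut point is invisible to later analysis, even though the page was technically "opened".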
How to catch the single-post author next time
- Manually enumerate monthly archive URLs (/2019/08/, /2019/07/, …) and let the tool open each page.
- Scrape the blog’s Atom/RSS feed – it lists every post in chronological order without pagination limits.
- If the platform supports it, append ?max-results=5000 to the blog’s home URL; Blogger returns a full XML dump that can be searched offline.
- Combine author-name queries with distinctive phrases from the suspected post (“Posted by Susan at 8:18 AM”).
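The first two suggestions can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions: `monthly_archive_urls` and `author_shares` are hypothetical helper names, the feed is a tiny made-up sample rather than real blog data, and fetching the feed over the network is deliberately left out because the feed's URL is not given in the text.

```python
from collections import Counter
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom XML namespace


def monthly_archive_urls(base_url, first, last):
    """Return /YYYY/MM/ archive URLs for every month from `first` to
    `last` inclusive; both are (year, month) tuples."""
    year, month = first
    urls = []
    while (year, month) <= last:
        urls.append(f"{base_url.rstrip('/')}/{year:04d}/{month:02d}/")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return urls


def author_shares(atom_xml):
    """Map each entry author in an Atom feed to their percentage of posts."""
    root = ET.fromstring(atom_xml)
    names = [
        entry.findtext(f"{ATOM}author/{ATOM}name", default="(unknown)")
        for entry in root.iter(f"{ATOM}entry")
    ]
    counts = Counter(names)
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


# Made-up sample feed for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Post A</title><author><name>Germaine</name></author></entry>
  <entry><title>Post B</title><author><name>Germaine</name></author></entry>
  <entry><title>Post C</title><author><name>Susan</name></author></entry>
  <entry><title>Post D</title><author><name>Germaine</name></author></entry>
</feed>"""

print(monthly_archive_urls("https://dispol.blogspot.com/", (2019, 7), (2019, 8)))
print(author_shares(SAMPLE_FEED))
```

Counting authors from the feed sidesteps the early-stopping problem entirely, because every entry carries its own author element and no page-by-page heuristic decides when to stop.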
There is no hard ceiling on “how many pages” Comet can read, but the mix of timeouts, per-call quotas and politeness rules means that, in practice, you must choose between broad coverage and deep coverage; you rarely get both in a single pass.