1) Executive Summary
Current state has critical crawl/index/content gaps:
- No SEO metadata system (titles, descriptions, canonicals, OG/Twitter tags) in shared layout.
- No sitemap, no robots strategy, no structured data.
- Important nav links point to missing routes (/blog, /pricing, /int), causing internal crawl waste.
- Multiple thin pages with placeholder text only (docs, blogs, integrations, components, privacy-policy, terms-of-service).
- Public content discoverability is weak: collection papers are mostly hidden behind client interactions, and there is no sitemap.
- Public pages have duplicate URL risk due to handle/slug normalization without canonical redirects.
- Public API payload/query patterns can be faster (redundant calls, no pagination, no response caching headers at API layer).
2) High-Priority Findings (Code Evidence)
Critical
- No global metadata framework
  - File: astro/src/layouts/Layout.astro
  - Only <title> exists; no meta description, canonical, robots, OG, Twitter, or JSON-LD support.
- Broken internal links in home nav
  - File: astro/src/pages/index.astro
  - Links currently include /int, /blog, and /pricing, but these pages do not exist.
- Thin/placeholder pages are an indexable quality risk
  - Files:
    - astro/src/pages/blogs.astro
    - astro/src/pages/docs.astro
    - astro/src/pages/integrations.astro
    - astro/src/pages/components.astro
    - astro/src/pages/privacy-policy.astro
    - astro/src/pages/terms-of-service.astro
  - Each file currently contains only 1 line of plain text.
- Soft-404 behavior risk from redirects to /404
  - Files:
    - astro/src/pages/[handle]/index.astro
    - astro/src/pages/[handle]/p/[projectSlug]/index.astro
  - Not-found cases use redirects instead of direct 404 response rendering.
- No sitemap/robots endpoints or config
  - File: astro/astro.config.mjs
  - No site config and no sitemap integration.
  - No robots page/file in astro/src/pages.
High
- Public project page performs redundant data retrieval
  - File: astro/src/pages/[handle]/p/[projectSlug]/index.astro
  - Fetches project data, then separately fetches owner profile (getPublicProfile) even though both are public payload concerns.
- URL duplication risk from normalization without redirect
  - Files:
    - astro/src/pages/[handle]/index.astro
    - astro/src/pages/[handle]/[slug].astro
    - astro/src/pages/[handle]/p/[projectSlug]/index.astro
  - @handle, uppercase handles, and slug variants resolve to the same content but are not canonically redirected.
- Collection paper discovery is weak for bots
  - File: astro/src/components/project/ProjectCollectionsViewer.tsx
  - Collection papers are fetched only on accordion interaction, limiting crawl discovery without strong sitemap support.
- Public pages hydrate large React islands
  - Files:
    - astro/src/pages/[handle]/index.astro (ProfilePage client:load)
    - astro/src/pages/[handle]/p/[projectSlug]/index.astro (PublicProjectPage client:load)
  - Ships more JavaScript to the client than mostly-content pages need.
- No image SEO baseline
  - Files:
    - astro/src/components/paperCardComponent.tsx
    - astro/src/components/project/PublicProjectPage.tsx
  - Several images lack alt text, explicit dimensions, and an optimized delivery path.
Medium
- Public API has no pagination for potentially large lists
  - File: fastapi/app/api/v1/endpoints/public.py
  - Endpoints return full arrays for projects/papers/collections.
- Public API list methods rely on unbounded field scans
  - Files:
    - fastapi/app/services/papers_service.py
    - fastapi/app/services/projects_service.py
    - fastapi/app/core/firestore_store.py
  - find_by_fields(...).stream() without cursor pagination or ordering for public feed endpoints.
- Missing API-level response caching/compression headers
  - Files:
    - fastapi/app/main.py
    - fastapi/app/api/v1/endpoints/public.py
  - No gzip middleware and no public endpoint cache-control/etag policy.
3) Target SEO + GEO Architecture
3.1 Metadata + Canonical System (Global)
Implement a shared SEO props model in layout:
- title
- description
- canonical
- robots
- ogType, ogImage, ogSiteName
- twitterCard, twitterSite
- jsonLd (array support)
Files to change:
- astro/src/layouts/Layout.astro
- New helper: astro/src/lib/seo.ts
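A minimal sketch of what the seo.ts helper could look like, assuming the prop names listed above. The DEFAULTS values, the resolveSeo and toCanonical function names, and the "index,follow" fallback are illustrative assumptions, not existing project code.

```typescript
// Hypothetical shape for astro/src/lib/seo.ts; all names and defaults are assumptions.
interface SeoProps {
  title: string;
  description?: string;
  canonical?: string; // path or absolute URL
  robots?: string; // e.g. "index,follow" or "noindex,nofollow"
  ogType?: string;
  ogImage?: string;
  ogSiteName?: string;
  twitterCard?: string;
  twitterSite?: string;
  jsonLd?: Record<string, unknown>[];
}

// Placeholder site-wide defaults; replace with the real canonical host and copy.
const DEFAULTS = {
  siteUrl: "https://example.com",
  siteName: "Example",
  description: "Default site description.",
};

/** Build an absolute canonical URL and drop any query string or fragment. */
function toCanonical(pathOrUrl: string, siteUrl: string = DEFAULTS.siteUrl): string {
  const absolute = /^https?:\/\//i.test(pathOrUrl)
    ? pathOrUrl
    : siteUrl.replace(/\/+$/, "") + "/" + pathOrUrl.replace(/^\/+/, "");
  return absolute.split(/[?#]/)[0] ?? absolute;
}

/** Fill every optional field so the layout can render tags unconditionally. */
function resolveSeo(props: SeoProps, currentPath: string) {
  return {
    title: props.title,
    description: props.description ?? DEFAULTS.description,
    canonical: toCanonical(props.canonical ?? currentPath),
    robots: props.robots ?? "index,follow",
    ogType: props.ogType ?? "website",
    ogImage: props.ogImage,
    ogSiteName: props.ogSiteName ?? DEFAULTS.siteName,
    twitterCard: props.twitterCard ?? (props.ogImage ? "summary_large_image" : "summary"),
    twitterSite: props.twitterSite,
    jsonLd: props.jsonLd ?? [],
  };
}
```

Thin pages could pass robots: "noindex,follow" through the same contract until their content is complete.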
3.2 Robots + Sitemap + Feeds
Add:
- astro/src/pages/robots.txt.ts
- astro/src/pages/sitemap-index.xml.ts
- astro/src/pages/sitemaps/public-pages.xml.ts
- astro/src/pages/sitemaps/public-papers.xml.ts
- astro/src/pages/sitemaps/public-projects.xml.ts
- astro/src/pages/rss.xml.ts (marketing/blog feed)
Use segmented sitemaps for scaling and easier monitoring.
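The XML these endpoints return can be produced by small pure functions. The sketch below assumes the sitemaps.org 0.9 schema; the function names and the permissive robots policy are hypothetical and would need to match the real crawl strategy.

```typescript
interface SitemapEntry {
  loc: string; // absolute URL
  lastmod?: string; // ISO 8601 date, optional
}

/** Escape the five XML special characters so URLs with "&" stay valid. */
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function renderEntries(tag: "url" | "sitemap", entries: SitemapEntry[]): string {
  return entries
    .map((entry) => {
      const lastmod = entry.lastmod ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>` : "";
      return `  <${tag}><loc>${escapeXml(entry.loc)}</loc>${lastmod}</${tag}>`;
    })
    .join("\n");
}

/** Body for sitemap-index.xml: one <sitemap> per segmented map. */
function buildSitemapIndex(entries: SitemapEntry[]): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    renderEntries("sitemap", entries),
    "</sitemapindex>",
  ].join("\n");
}

/** Body for one segmented map, e.g. sitemaps/public-papers.xml. */
function buildUrlSet(entries: SitemapEntry[]): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    renderEntries("url", entries),
    "</urlset>",
  ].join("\n");
}

/** Body for robots.txt; allowing everything is an assumption, adjust as needed. */
function buildRobotsTxt(siteUrl: string): string {
  return ["User-agent: *", "Allow: /", "", `Sitemap: ${siteUrl}/sitemap-index.xml`, ""].join("\n");
}
```

The sitemap protocol caps a single file at 50,000 URLs, so the papers segment is the one most likely to need further splitting as content grows.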
3.3 Structured Data (JSON-LD)
Add JSON-LD by page type:
- Homepage: Organization, WebSite
- User page: Person, ProfilePage
- Project page: CollectionPage or CreativeWorkSeries
- Paper page: Article + BreadcrumbList
- Blog post pages: BlogPosting
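As an illustration of the paper-page case, the sketch below builds Article and BreadcrumbList objects using schema.org vocabulary. The PaperMeta shape and the function names are assumptions; the real public API payload may use different field names.

```typescript
// Hypothetical input shape; align with the actual paper payload.
interface PaperMeta {
  title: string;
  description: string;
  url: string;
  authorName: string;
  authorUrl: string;
  publishedAt: string; // ISO 8601
  updatedAt: string; // ISO 8601
}

function articleJsonLd(paper: PaperMeta): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: paper.title,
    description: paper.description,
    url: paper.url,
    mainEntityOfPage: paper.url,
    datePublished: paper.publishedAt,
    dateModified: paper.updatedAt,
    author: { "@type": "Person", name: paper.authorName, url: paper.authorUrl },
  };
}

function breadcrumbJsonLd(trail: { name: string; url: string }[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: trail.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  };
}

/**
 * Serialize for a <script type="application/ld+json"> tag.
 * Escaping "<" keeps user-supplied text from closing the script tag early.
 */
function serializeJsonLd(data: unknown): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

The escaping step matters here because titles and descriptions come from user-generated content.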
3.4 GEO (AI Search) Layer
Add:
- /llms.txt (concise map of high-value URLs + product definition)
- /llms-full.txt (expanded, machine-friendly knowledge document)
- Q&A blocks on key pages (problem -> approach -> examples -> constraints)
- Strong author/entity signals (real author cards, updated dates, source citations)
- Comparison pages and use-case pages with structured, factual answers
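One possible way to generate /llms.txt is sketched below. It assumes the commonly proposed layout for that file (a top-level heading with the product name, a quoted one-line summary, then sections of annotated links); the types and the buildLlmsTxt name are hypothetical.

```typescript
interface LlmsLink {
  title: string;
  url: string;
  note?: string; // short factual description of the page
}

interface LlmsSection {
  heading: string;
  links: LlmsLink[];
}

/** Render a plain-text map of high-value URLs plus a product definition. */
function buildLlmsTxt(name: string, summary: string, sections: LlmsSection[]): string {
  const lines: string[] = [`# ${name}`, "", `> ${summary}`];
  for (const section of sections) {
    lines.push("", `## ${section.heading}`, "");
    for (const link of section.links) {
      const note = link.note ? `: ${link.note}` : "";
      lines.push(`- [${link.title}](${link.url})${note}`);
    }
  }
  return lines.join("\n") + "\n";
}
```

Generating the file from the same route list that feeds the sitemap would keep the two from drifting apart.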
4) Public User/Project/Paper Page Upgrade Plan
4.1 /[handle] user page
Current issues:
- Minimal metadata, no Person schema, potential duplicate URLs.
Changes:
- Add unique title/description from user profile.
- Add canonical URL and normalized redirect (/@name -> /name, uppercase -> lowercase).
- Add Person + ProfilePage JSON-LD.
- Add server-rendered links to all public content (standalone + collection papers via dedicated pages or sitemap guarantee).
- Keep only a small interactive island for tab switching if needed.
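The normalized redirect can be decided by a pure function that returns the canonical path only when the request differs from it. This is a sketch under assumptions: the canonicalHandlePath name is hypothetical, only the handle segment is lowercased (slug casing rules are left untouched), and trailing slashes are dropped.

```typescript
/**
 * Returns the canonical path when a redirect is needed, or null when the
 * requested path is already canonical.
 */
function canonicalHandlePath(pathname: string): string | null {
  const segments = pathname.split("/").filter((segment) => segment.length > 0);
  const first = segments[0];
  if (first === undefined) return null; // root path, nothing to normalize

  // "/@Name" and "/NAME" both collapse to "/name".
  const handle = first.replace(/^@/, "").toLowerCase();
  const canonical = "/" + [handle, ...segments.slice(1)].join("/");
  return canonical === pathname ? null : canonical;
}
```

In the [handle] pages, a non-null result would be returned as a permanent (301) redirect before any data fetching, so variant URLs never render content of their own.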
4.2 /[handle]/p/[projectSlug] project page
Current issues:
- No metadata/schema.
- Redundant owner fetch.
- Collection paper links are loaded lazily.
Changes:
- Include owner in project API payload; remove extra profile request.
- Add CollectionPage schema and rich metadata.
- Pre-render top papers and collection links server-side.
- Add paginated collection pages if project is large.
4.3 /[handle]/[slug] paper page
Current issues:
- No article metadata/schema.
- Duplicate URL normalization risk.
Changes:
- Add Article JSON-LD and BreadcrumbList.
- Add reading time, updated-at date, author link, and internal links to related papers.
- Add canonical redirect rules for slug normalization.
- Add server-side excerpt generation for description when missing.
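Server-side excerpt generation can be approximated with a few regex passes over the paper body. The sketch assumes the body is stored as Markdown and uses a 155-character budget as an assumed target for meta descriptions; excerptFromMarkdown is a hypothetical name, and a real Markdown parser would be more robust.

```typescript
/** Derive a plain-text meta description from a Markdown body. */
function excerptFromMarkdown(markdown: string, maxLength: number = 155): string {
  const text = markdown
    .replace(/`{3}[\s\S]*?`{3}/g, " ") // drop fenced code blocks
    .replace(/!\[[^\]]*\]\([^)]*\)/g, " ") // drop images
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // keep link text, drop the URL
    .replace(/^#{1,6}\s+.*$/gm, "") // drop heading lines (the title is rendered separately)
    .replace(/^>\s?/gm, "") // strip blockquote markers
    .replace(/[*`]/g, "") // strip emphasis and inline-code markers
    .replace(/\s+/g, " ")
    .trim();

  if (text.length <= maxLength) return text;

  // Cut at the last word boundary that fits, leaving room for the ellipsis.
  const cut = text.slice(0, maxLength - 1);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "…";
}
```

The excerpt should only be a fallback: an author-supplied description, when present, takes precedence.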
5) Pages To Add and Modify
5.1 Must Add (Revenue + Authority + GEO)
- /pricing
- /blog (index)
- /blog/[slug] (marketing posts)
- /features
- /use-cases/[segment] (at least 4 initial segments)
- /compare/[alternative] (at least 3 initial alternatives)
- /changelog
- /about
- /contact
- /llms.txt
- /llms-full.txt
- /robots.txt
- Sitemap endpoints (index + segmented maps)
5.2 Must Fix Existing Routes
- index.astro nav links (/int, /blog, /pricing) -> valid URLs.
- Expand all one-line thin pages or set temporary noindex until complete.
- Footer must expose crawlable legal/support links.
- Reserve new root paths in:
  - astro/src/lib/reservedPaths.ts
  - fastapi/app/core/reserved_paths.py
6) Blog Strategy (Topics + Information Architecture)
6.1 Recommended blog clusters
Cluster A: Programmatic SEO and content operations
- Programmatic SEO fundamentals for API-first CMS
- Building content hubs that avoid cannibalization
- Scaling internal linking with structured content
Cluster B: Developer publishing workflows
- Markdown-first publishing architecture
- Multi-channel distribution automation
- CMS API design patterns for teams
Cluster C: AI search readiness (GEO)
- How LLMs retrieve and cite web content
- Designing pages for AI overview inclusion
- Entity SEO and structured data for developer products
Cluster D: Technical SEO for content-heavy products
- Core Web Vitals for content platforms
- Crawl budget and pagination in dynamic sites
- Canonicalization patterns for user-generated content
6.2 Suggested first 20 posts
Create 5 posts per cluster above, with one pillar page per cluster and 4 supporting posts each. Interlink pillar <-> supporting posts bi-directionally.
7) Where To Store Blog Data (Yes, Database Is Fine)
Yes, you can store blogs in a database. Recommended approach:
Option A (Recommended for your stack): Firestore-backed blog content
New collections:
- marketingPosts
- marketingAuthors
- marketingCategories
- marketingTags
marketingPosts fields:
- postId, slug, title, excerpt, bodyMarkdown
- authorId, categoryId, tagIds[]
- status (draft|published)
- publishedAt, updatedAt
- canonicalUrl
- coverImageUrl, coverImageAlt
- metaTitle, metaDescription
- ogImageUrl
Rules:
- Precompute and store readingTime, toc, wordCount.
- Cache list/detail responses in Redis.
- Serve paginated APIs (cursor, limit).
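The precomputed fields can be derived once at save time from bodyMarkdown. The sketch below makes several assumptions that are not in the plan: 200 words per minute, a table of contents built from level 2 and 3 headings only, and fenced code excluded from the word count. computePostStats and slugify are hypothetical names.

```typescript
interface TocEntry {
  level: number; // 2 or 3
  text: string;
  id: string; // anchor slug
}

interface PostStats {
  wordCount: number;
  readingTime: number; // whole minutes, at least 1
  toc: TocEntry[];
}

function slugify(text: string): string {
  return text.toLowerCase().trim().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function computePostStats(bodyMarkdown: string, wordsPerMinute: number = 200): PostStats {
  const toc: TocEntry[] = [];
  let wordCount = 0;
  let inFence = false;

  for (const line of bodyMarkdown.split("\n")) {
    // Toggle on fence delimiters; fenced code is skipped entirely (assumption).
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;

    const heading = /^(#{2,3})\s+(.+?)\s*$/.exec(line);
    if (heading) {
      const marks = heading[1] ?? "";
      const text = heading[2] ?? "";
      toc.push({ level: marks.length, text, id: slugify(text) });
    }

    // Count tokens that are not pure Markdown punctuation (heading text is included).
    wordCount += line
      .split(/\s+/)
      .filter((token) => token.length > 0 && !/^[#*_`>|=+-]+$/.test(token)).length;
  }

  const readingTime = Math.max(1, Math.ceil(wordCount / wordsPerMinute));
  return { wordCount, readingTime, toc };
}
```

Storing the result alongside the post means list and detail endpoints never need to parse the body at request time.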
Option B: MDX files in repo for marketing pages
Best for editorial versioning and static pre-render speed.
Hybrid recommendation
- Marketing blog/docs pages in MDX (high control, fast builds).
- User-generated papers/projects remain in Firestore.
8) Data Fetching and Speed Improvement Plan
Frontend (Astro)
- Remove redundant API calls on project page by extending one backend payload.
- Convert large public pages to mostly server-rendered HTML with small client islands.
- Avoid loading collection papers only after click if discoverability matters; render crawlable links.
- Add image optimization strategy (dimensions, modern format, priority only for LCP image).
- Add explicit cache policy per route and avoid inconsistent headers.
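One way to keep headers consistent is a single policy table consulted by every route. The route groups and max-age values below are placeholder assumptions chosen only to show the pattern; cacheControlFor is a hypothetical helper.

```typescript
interface CachePolicy {
  name: string;
  pattern: RegExp;
  header: string;
}

// First match wins, so more specific groups come first. Values are placeholders.
const CACHE_POLICIES: CachePolicy[] = [
  {
    name: "crawl-files",
    pattern: /^\/(robots\.txt|sitemap-index\.xml|sitemaps\/[^/]+\.xml)$/,
    header: "public, max-age=3600",
  },
  {
    name: "marketing",
    pattern: /^\/($|(blog|pricing|features)(\/|$))/,
    header: "public, max-age=300, stale-while-revalidate=86400",
  },
  {
    name: "public-content", // /[handle] and everything below it
    pattern: /^\/[^/]+(\/.*)?$/,
    header: "public, max-age=60, stale-while-revalidate=600",
  },
];

/** Resolve the Cache-Control value for a request path. */
function cacheControlFor(pathname: string): string {
  for (const policy of CACHE_POLICIES) {
    if (policy.pattern.test(pathname)) return policy.header;
  }
  return "no-store"; // anything unmatched is treated as uncacheable
}
```

Authenticated or dashboard routes, if they share the same app, would need an explicit earlier entry so they never fall into the public-content group.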
Backend (FastAPI + Firestore)
- Add paginated public list endpoints:
  - /public/{handle}?paper_limit=...&paper_cursor=...
  - /public/{handle}/projects/{project_slug}?...
- Add pre-sorted query support in store layer (order_by, limit, start_after).
- Add aggregate cache keys for public profile/project payloads.
- Add response compression middleware.
- Add Cache-Control and optional ETag on public responses.
- Add lightweight list DTOs for cards (avoid large body fields unless needed).
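The order_by / limit / start_after contract can be modelled in memory to pin down its behavior before touching the store layer. This is only an illustration of the cursor semantics, written in TypeScript for consistency with the frontend; the actual implementation would live in the Python store layer against Firestore. PaperCard, the cursor format, and listPapers are assumptions.

```typescript
// Lightweight card DTO: no body field. ISO timestamps sort lexicographically.
interface PaperCard {
  id: string;
  title: string;
  publishedAt: string;
}

interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

type SortKey = { publishedAt: string; id: string };

/** Newest first, with id as a tie-break so the order is total and stable. */
function compareKeys(a: SortKey, b: SortKey): number {
  if (a.publishedAt !== b.publishedAt) return a.publishedAt > b.publishedAt ? -1 : 1;
  if (a.id !== b.id) return a.id < b.id ? -1 : 1;
  return 0;
}

// Assumed cursor format: "<publishedAt>|<id>" of the last item returned.
function encodeCursor(key: SortKey): string {
  return `${key.publishedAt}|${key.id}`;
}

function decodeCursor(cursor: string): SortKey {
  const separator = cursor.indexOf("|");
  return { publishedAt: cursor.slice(0, separator), id: cursor.slice(separator + 1) };
}

function listPapers(all: PaperCard[], limit: number, cursor?: string): Page<PaperCard> {
  const sorted = [...all].sort(compareKeys); // order_by
  let start = 0;
  if (cursor) {
    // start_after: first item that sorts strictly after the cursor key,
    // which still works if the cursor item itself was deleted.
    const after = decodeCursor(cursor);
    const index = sorted.findIndex((paper) => compareKeys(paper, after) > 0);
    start = index === -1 ? sorted.length : index;
  }
  const items = sorted.slice(start, start + limit); // limit
  const last = items[items.length - 1];
  const hasMore = start + items.length < sorted.length;
  return { items, nextCursor: hasMore && last ? encodeCursor(last) : null };
}
```

The tie-break on id is the important design choice: without it, papers sharing a timestamp could be skipped or repeated across pages.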
9) Phase-by-Phase Implementation
Phase 0 (Day 1-2): Critical crawl/index foundation
- Build global SEO metadata system in layout.
- Add robots + sitemap endpoints.
- Fix nav links and route mismatches.
- Decide canonical host and enforce HTTPS/non-www policy.
- Replace 302-to-404 pattern with proper 404 responses.
Success criteria:
- Every indexable URL has unique title + description + canonical.
- Sitemaps are live and robots.txt references them.
Phase 1 (Day 3-5): Public page SEO + schema
- Implement metadata + JSON-LD for user/project/paper pages.
- Normalize URLs with redirect rules.
- Improve heading structure and on-page content snippets.
- Add related-content internal linking on paper pages.
Success criteria:
- Rich Results validation passes for article/profile pages.
- No duplicate URL variants in crawl exports.
Phase 2 (Week 2): Content and GEO expansion
- Launch /blog, /pricing, /features, /use-cases, /compare.
- Publish first 20 posts across 4 clusters.
- Add /llms.txt + /llms-full.txt.
- Add author pages and E-E-A-T elements.
Success criteria:
- Indexed page count in Search Console grows steadily.
- AI assistants can retrieve clean product definitions and citations.
Phase 3 (Week 3): Performance and scale
- Add pagination and caching for heavy public endpoints.
- Reduce hydration JS on public pages.
- Introduce query-level optimization in Firestore access layer.
- Add monitoring dashboards and SLOs.
Success criteria:
- Lower TTFB and faster LCP on public pages.
- Stable response times under larger datasets.
10) Manual Tasks Outside This Project
- Google Search Console
  - Verify domain property.
  - Submit sitemap index.
  - Inspect and request indexing for key new pages.
  - Monitor coverage, CWV, and enhancement reports weekly.
- Bing Webmaster Tools
  - Verify site and submit sitemap.
- Analytics and monitoring
  - GA4 + conversion events for signups and content-to-signup paths.
  - Track organic landing pages, CTR, and assisted conversions.
- CDN and hosting
  - Ensure Brotli/gzip enabled at edge.
  - Confirm caching behavior for HTML vs static assets.
- Editorial process
  - Assign author owners per cluster.
  - Publish cadence: minimum 2 posts/week for first 10 weeks.
  - Quarterly content refresh for top pages.
- Authority building
  - Acquire links from developer communities and partner integrations.
  - Publish benchmark/case-study posts with original data.
- Brand/entity consistency
  - Keep organization name, social profiles, and product description consistent across site and external profiles.
11) KPI Dashboard (Track Weekly)
Primary:
- Indexed pages
- Non-brand impressions and clicks
- Avg position for target clusters
- Organic signup conversions
Technical:
- LCP, INP, CLS for top templates
- Crawl errors and duplicate/canonical issues
- Sitemap indexed-to-submitted ratio
GEO:
- Brand/entity mentions in AI answers
- Citation frequency of your domain in AI outputs
- Referral traffic from AI assistants (when detectable)
12) Immediate Next 10 Engineering Tasks
- Implement SEO prop contract in Layout.astro.
- Add seo.ts helper to generate canonical/meta defaults.
- Create robots.txt.ts and sitemap routes.
- Fix broken nav URLs in index.astro.
- Replace one-line thin pages with real content or temporary noindex.
- Add page metadata + JSON-LD to:
  - [handle]/index.astro
  - [handle]/p/[projectSlug]/index.astro
  - [handle]/[slug].astro
- Add normalized redirect logic for handle/slug variants.
- Extend public project API to include owner summary in one response.
- Add pagination params to public profile/project APIs.
- Add FastAPI compression and response cache headers for public endpoints.
If you execute Phases 0 and 1 completely, you should see meaningful crawl/index quality improvement quickly. Phases 2 and 3 are where long-term SEO + GEO compounding happens.