Homeowner experiences, agent discussions, E&S/surplus lines, and FAIR Plan coverage in California wildfire zones
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| topic-discovery | completed | 3/28/2026, 5:53:19 AM | 3/28/2026, 5:53:56 AM | 0 | $0.12 | — |
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| api-discovery | completed | 3/28/2026, 5:21:58 AM | 3/28/2026, 5:22:00 AM | 10 | — | — |
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| api-discovery | completed | 3/28/2026, 5:21:28 AM | 3/28/2026, 5:21:45 AM | 11 | $0.04 | — |
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/16/2026, 4:13:13 AM | 3/16/2026, 4:24:51 AM | 262 | — | — |
Marginal value: Tier C adds 262 posts, but only 8 are high-quality
Scraped 6 Tier C subreddits (hyperlocal fire zones + investor subs). r/Malibu had the best signal-to-noise with FAIR Plan pricing discussion. r/realestateinvesting was fully blocked by Reddit 403 rate-limiting. Total Reddit corpus now ~4,334 posts across 34 subreddits.
Relevance: 3%
Location Mentions: 16%
Cost/Post: Free
Audience Split: unknown 106, homeowner 156
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/15/2026, 5:50:33 PM | 3/15/2026, 6:22:12 PM | 1246 | — | — |
Tier B targeted scrape — 1246 posts from 11 subreddits
Scraped 11 Tier B subreddits (hyperlocal fire zones, real estate, legal, financial planning). Subreddits: r/altadena, r/burbank, r/TahoeLocals, r/santarosa, r/grassvalley, r/RealEstate....
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/15/2026, 8:46:55 AM | 3/15/2026, 9:26:18 AM | 1798 | — | — |
Tier A targeted scrape — 1798 posts from 16 subreddits
Scraped 16 Tier A subreddits using the `subreddit:` search operator with the Apify lite actor. Subreddits: r/HomeInsurance, r/InsuranceClaims, r/InsuranceAgent, r/InsurancePros, r/homeowners, r/California....
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/15/2026, 8:41:58 AM | 3/15/2026, 8:44:59 AM | 144 | — | — |
Single subreddit test — 144 posts from r/Insurance
Test run of the targeted subreddit scraper against r/Insurance. Validated that the `subreddit:` search operator works with the free Apify lite actor.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| uphelp.org | completed | 3/15/2026, 7:50:22 AM | 3/15/2026, 7:54:25 AM | 1 | — | — |
Unreliable — only 1 of 4 question pages scraped successfully via Apify. Cloudflare interference causes ~75% failure rate. High-value content but not worth the Apify cost at this success rate.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| bogleheads.org | completed | 3/15/2026, 7:49:43 AM | 3/15/2026, 7:50:07 AM | 0 | — | — |
Blocked — 403 even with Apify Web Scraper + residential proxy + Chrome stealth. IP-level blocking, not just Cloudflare. 10 relevant threads identified but cannot be scraped automatically.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| mrmoneymustache.com | completed | 3/15/2026, 7:49:06 AM | 3/15/2026, 7:49:28 AM | 106 | — | — |
Go — 106 homeowner posts from 5 FIRE community threads. Financially sophisticated homeowners analyzing insurance economics, self-insuring calculations, LA fire settlements. Unique audience perspective.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| talkirvine.com | completed | 3/15/2026, 7:48:27 AM | 3/15/2026, 7:48:46 AM | 42 | — | — |
Go — 42 homeowner posts from 3 Irvine/OC community threads. XenForo forum, same parser as insurance-forums.com. Real homeowners discussing premium increases, carrier comparisons, fire zone surcharges.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| insurance-forums.com | completed | 3/15/2026, 7:04:20 AM | 3/15/2026, 7:15:54 AM | 54 | — | — |
54 new broker posts from 9 remaining threads. Total insurance-forums.com posts now 134 across 19 threads. Apify Web Scraper with stealth successfully replaces Firecrawl.
Location Mentions: n/a
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| insurance-forums.com | completed | 3/15/2026, 7:03:23 AM | 3/15/2026, 7:03:58 AM | 5 | — | — |
Test successful — Apify Web Scraper with Chrome + stealth + residential proxy bypasses Cloudflare. XenForo HTML parsed via BeautifulSoup. 5 broker posts from 1 thread.
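The run above parsed XenForo HTML with BeautifulSoup. The sketch below illustrates the same extraction using only Python's standard-library `html.parser`, a swapped-in stand-in chosen so the example has no third-party dependency. The `message--post` class, the `data-author` attribute, and the `bbWrapper` class are assumptions about XenForo 2 markup and are not confirmed for insurance-forums.com; the sample HTML is invented.

```python
from html.parser import HTMLParser


class XenForoPostParser(HTMLParser):
    """Collect (author, text) pairs from XenForo-style thread HTML."""

    def __init__(self):
        super().__init__()
        self.posts = []       # finished (author, text) tuples
        self._author = None   # author of the <article> currently being read
        self._depth = 0       # <div> nesting depth inside the post body
        self._chunks = []     # text fragments of the current post body

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "message--post" in classes:
            self._author = attrs.get("data-author")
        elif tag == "div":
            if self._depth:
                self._depth += 1   # nested div (e.g. a quote) inside the body
            elif self._author is not None and "bbWrapper" in classes:
                self._depth = 1    # entering the post body

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:   # left the post body
                text = " ".join(" ".join(self._chunks).split())
                self.posts.append((self._author, text))
                self._chunks = []
                self._author = None  # ignore later wrappers such as signatures
        elif tag == "article":
            self._author = None

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


# Invented sample markup, for illustration only.
SAMPLE_HTML = """
<article class="message message--post" data-author="broker_a">
  <div class="bbWrapper"><div class="bbCodeBlock">quoted text</div>FAIR Plan pricing keeps rising.</div>
</article>
<article class="message message--post" data-author="broker_b">
  <div class="bbWrapper">E&amp;S carriers are quoting again.</div>
</article>
"""

parser = XenForoPostParser()
parser.feed(SAMPLE_HTML)
parser.close()
for author, text in parser.posts:
    print(author, "|", text)
```

Tracking `<div>` depth keeps quoted blocks inside the post body, and clearing the author after the first body avoids picking up a second wrapper from the same post.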
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| insurance-forums.com | completed | 3/15/2026, 7:02:06 AM | 3/15/2026, 7:02:43 AM | 0 | — | — |
Failed — Cloudflare blocked Apify Web Scraper without stealth mode. 403 on all retries. Led to enabling Chrome + stealth + residential proxy.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| forum-discovery | completed | 3/15/2026, 6:44:57 AM | 3/15/2026, 6:47:43 AM | 0 | $0.75 | — |
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| google-news | completed | 3/15/2026, 6:33:22 AM | 3/15/2026, 6:34:01 AM | 85 | $0.18 | — |
Go — viable tertiary source. 85 news articles collected via SerpAPI Google News. Top sources: CalMatters, Insurance Journal, LA Times. Snippet text captures key information without full-article scraping.
Location Mentions: n/a
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| nextdoor | completed | 3/15/2026, 6:33:12 AM | 3/15/2026, 6:33:20 AM | 0 | — | — |
No-go
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| | completed | 3/15/2026, 6:32:55 AM | 3/15/2026, 6:33:10 AM | 0 | — | — |
Limited-go
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| redfin | completed | 3/15/2026, 5:52:29 AM | 3/15/2026, 5:52:43 AM | 0 | — | — |
No-go — editorial articles only, no community discussions
SerpAPI site-search returned 30 results: 10 staff-written blog/news articles about wildfire insurance, 19 property listings, and 1 false-positive blog post miscategorized as community. Zero user forums or Q&A sections. Redfin publishes agent-curated content but has no public community discussion features.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| zillow | completed | 3/15/2026, 5:52:16 AM | 3/15/2026, 5:52:27 AM | 0 | — | — |
No-go — no community discussion content found
SerpAPI site-search returned 30 results: 6 Zillow Research articles about insurance trends, 19 property listings, and 5 other pages. Zero user-generated forum posts, Q&A threads, or community discussions. Zillow has no public discussion boards — its "forums" are invite-only corporate events.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| | completed | 3/15/2026, 5:05:04 AM | 3/15/2026, 5:08:53 AM | 229 | $0.05 | — |
Viable Tier 2 source. Better geographic specificity than Reddit (38% vs 4%), but content is shallow. Needs relevance scoring and stock spam filtering.
Location Mentions: n/a
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| | completed | 3/15/2026, 4:56:22 AM | 3/15/2026, 5:00:36 AM | 40 | $0.07 | — |
Development run — field mapping adjustment. See final run.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| | completed | 3/15/2026, 4:50:43 AM | 3/15/2026, 4:54:16 AM | 223 | $0.05 | — |
Initial run: the `author_id` mapping was incorrect, so this data was replaced by the final run.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| biggerpockets.com | completed | 3/15/2026, 3:50:03 AM | 3/15/2026, 3:50:27 AM | 10 | — | — |
BiggerPockets investor/landlord perspective. Lower volume but useful for homeowner/investor insurance experience. Direct HTML scraping works (no Cloudflare).
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| insurance-forums.com | completed | 3/15/2026, 3:46:16 AM | 3/15/2026, 3:50:26 AM | 80 | — | 4 error(s) |
100% broker content from insurance-forums.com. Professional discussions about FAIR Plan, E&S markets, carrier exits. Highest quality source. Firecrawl free tier exhausted.
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/15/2026, 2:05:56 AM | 3/15/2026, 2:14:39 AM | 472 | — | — |
Batch 2 — geographic queries, similar noise profile
Second Apify batch using geographic + subreddit-targeted queries (LA wildfire, Bay Area, Sacramento, etc.). 472 new posts, 23 duplicates caught. After cleanup: 884 total Reddit rows (115 posts, 769 comments, 11 deleted). Noise ratio still high — most comments are off-topic. Location mention rate improved to 16% (vs 4% in batch 1) thanks to geographic query focus.
Relevance: 30%
Location Mentions: 16%
Cost/Post: $0.005
Audience Split: unknown 779, homeowner 105
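The duplicate check and the location mention rate reported for this batch could be computed along the lines below. This is a minimal sketch, not the pipeline's actual code: the `id` and `body` field names, the `LOCATIONS` list, and the sample rows are all hypothetical.

```python
# Hypothetical place names; a real list would cover every targeted fire zone.
LOCATIONS = ("los angeles", "bay area", "sacramento", "malibu", "altadena")


def dedupe(existing_ids, new_rows):
    """Return (fresh_rows, duplicate_count), keyed on each row's 'id'."""
    seen = set(existing_ids)
    fresh, dupes = [], 0
    for row in new_rows:
        if row["id"] in seen:
            dupes += 1
            continue
        seen.add(row["id"])
        fresh.append(row)
    return fresh, dupes


def location_mention_rate(rows):
    """Whole-number percentage of rows naming at least one known location."""
    if not rows:
        return None  # nothing to divide by; callers can display "n/a"
    hits = sum(
        any(loc in row["body"].lower() for loc in LOCATIONS) for row in rows
    )
    return round(100 * hits / len(rows))


# Invented sample data, for illustration only.
existing_ids = {"t3_a"}
batch = [
    {"id": "t3_a", "body": "Anyone else dropped in Malibu?"},
    {"id": "t3_b", "body": "Premiums doubled in Sacramento."},
    {"id": "t3_b", "body": "Premiums doubled in Sacramento."},
    {"id": "t1_c", "body": "Same here."},
]
fresh, dupes = dedupe(existing_ids, batch)
print(len(fresh), "new rows,", dupes, "duplicates,",
      location_mention_rate(fresh), "% mention a location")
```

Returning `None` for an empty batch keeps an undefined rate from being rendered as a number.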
| Source | Status | Started | Completed | Posts | Cost | Errors |
|---|---|---|---|---|---|---|
| reddit | completed | 3/15/2026, 12:19:37 AM | 3/15/2026, 12:26:48 AM | 437 | — | — |
Promising but noisy — needs filtering
Reddit has relevant CA home insurance threads, but the scraper pulls ~10 comments per thread and most are off-topic noise. Only 49 of 437 rows are thread-starting posts. 14 rows are [deleted]. Irrelevant subreddits (r/PokemonTCG, r/bouldering) also matched. Data needs relevance scoring and thread labeling before it is useful.
Relevance: 30%
Location Mentions: 4%
Cost/Post: Free
Audience Split: unknown 332, homeowner 105
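The relevance scoring and thread labeling this run calls for might look like the following sketch. It assumes a simple keyword-weight approach, which is only one possible scoring method; the weights, the `subreddit`, `is_post`, and `body` field names, and the sample rows are hypothetical. The blocked subreddits are the two irrelevant matches named in the run summary.

```python
# Hypothetical keyword weights; a real list would be tuned on hand-labeled rows.
KEYWORDS = {"fair plan": 3, "non-renew": 3, "wildfire": 2,
            "premium": 1, "insurance": 1, "carrier": 1}
# Subreddits the run reported as irrelevant matches.
BLOCKED_SUBREDDITS = {"pokemontcg", "bouldering"}
DELETED_MARKERS = {"", "[deleted]", "[removed]"}


def relevance_score(text):
    """Sum the weights of every keyword that appears in the text."""
    lowered = text.lower()
    return sum(weight for kw, weight in KEYWORDS.items() if kw in lowered)


def label_rows(rows, min_score=2):
    """Drop deleted, blocked, and low-scoring rows; label the rest."""
    kept = []
    for row in rows:
        body = row.get("body", "").strip()
        if body in DELETED_MARKERS:
            continue
        if row.get("subreddit", "").lower() in BLOCKED_SUBREDDITS:
            continue
        score = relevance_score(body)
        if score < min_score:
            continue
        kind = "post" if row.get("is_post") else "comment"
        kept.append({**row, "score": score, "kind": kind})
    return kept


# Invented sample rows, for illustration only.
sample = [
    {"subreddit": "HomeInsurance", "is_post": True,
     "body": "Our carrier non-renewed us and the FAIR Plan quote is steep."},
    {"subreddit": "HomeInsurance", "is_post": False, "body": "[deleted]"},
    {"subreddit": "PokemonTCG", "is_post": True,
     "body": "Is card insurance worth the premium?"},
    {"subreddit": "California", "is_post": False, "body": "Same here."},
]
kept = label_rows(sample)
print([(row["kind"], row["score"]) for row in kept])
```

Filtering on the subreddit before scoring keeps off-topic matches out even when their text happens to contain insurance vocabulary.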