Uncovering AI
📸 Kimi K2 ships code power at open-source prices

Kimi K2 ships code power at open-source prices—alongside DH3’s editor flex, Meituan’s 560B LongCat, on-device EmbeddingGemma RAG, Claude’s file creation, 3D/voice upgrades, and safety insights.

September 10, 2025

My fellow AI explorers

If AI had a sleep button, someone snapped it off this week. Open-weights models are posting SOTA-level coding results, ByteDance’s “mystery” image model is dunking on Nano-Banana, Meituan (yes, the food-delivery giant) shipped a 560B MoE, and 3D/real-time worlds just leveled up.

In today’s edition:

🧠 Kimi K2 (0905): open MoE that can code with the best
🖼️ DH3 vs. Nano-Banana: the stealth image editor flex
🐱 LongCat-Flash (560B): Meituan’s shock-drop MoE
🌍 World builders: Tencent’s HunyuanWorld-Voyager, ReconViGen few-shot 3D, and Oasis 2.0 real-time worlds
🔊 Voice: Chatterbox Multilingual + the VibeVoice takedown
🤖 Robotics: Figure 02 loads a dishwasher—autonomously

Must See AI Tools

💰 Payman: AI That Pays Humans. More than 10,000 people have signed up for the beta
💫 SubMagic: An AI tool that edits short-form content for you! (Get 10% off using code “uncoverai” at checkout)
🎤 11Labs: #1 AI voice generator (Click Here to get 10,000 free credits upon signing up!)
🤖 ManyChat: Automate your responses & conversations on IG, FB and more! (Click Here to get first month for free)
🎙️ Syllaby: The only social media marketing tool you’ll ever need - powered by AI! (Get 25% off the first month or any annual plan with code “UNCOVER” at checkout)

LLMs

Kimi K2-Instruct-0905: Open MoE, Big Coding Energy

Meet Kimi K2-Instruct-0905 (Moonshot’s open-weights MoE), now with ~32B active params and 256K context, plus fresh instruct tuning for agentic workflows and code. You can try it in the Kimi app or hit the Kimi API.

Why this matters

Dynamic MoE = big-model capability with smaller per-token compute
Long context + tool use make it practical for end-to-end app scaffolds
Community ports (e.g., GGUF) are already popping up
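
For readers who want to see why an MoE only pays for a slice of its parameters per token, here is a minimal sketch of generic top-k expert routing. It is not Moonshot’s implementation: the toy experts, the router scores, and k=2 are all made-up values for illustration, and K2’s actual router is not described in this post.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    gates = route_top_k(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())

# Toy experts: each scalar function stands in for a full feed-forward block.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, 0.3, 1.5]  # hypothetical router scores for one token

gates = route_top_k(router_logits, k=2)
y = moe_forward(10.0, experts, router_logits, k=2)
print(gates, y)
```

Only two of the four toy experts execute for this token, which is the whole trick: total parameter count can grow while per-token compute tracks the active experts.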

Prediction: Expect a wave of K2-powered vertical code copilots (Next.js, Shopify, Unreal).

🔗 Check out Kimi here

Founders…

🤖 Need an AI Agency to Help Your Business Implement AI Solutions?

Our preferred partner Align AI provides you with an expert AI and Automation implementation team to add 10-40+ hours of increased productivity per employee and achieve your goals faster.

Click here to schedule your company’s complimentary AI and Automation Strategy Plan

AI Risks

Dario Amodei: Entry-Level White-Collar Jobs at Risk + Real-World Misuse of Claude

TL;DR: In a new interview, Anthropic CEO Dario Amodei warns that AI could displace a large share of entry-level white-collar roles within 1–5 years if society “doesn’t handle this well,” and says Anthropic has disrupted real-world attempts to weaponize Claude, from ransomware to North Korean sanctions-evasion job fraud. He argues for transparent disclosure and stronger regulation.

Watch the clip here.

What’s new

Jobs: Amodei reiterates that AI is already strong at document review, admin coordination, basic financial analysis and slide-building—the “workhorses” of junior roles—raising near-term displacement risks. Separate coverage earlier put the estimate at up to ~50% of entry-level white-collar jobs over five years, with unemployment spikes possible.
Misuse & security: Anthropic disclosed that it detected and shut down operations abusing Claude for cyberattacks (including ransomware) and that North Korean operatives tried to use Claude to land remote tech jobs at U.S. firms—now publicly documented in its misuse reports and covered by major outlets.
Lab vs. reality: Some alarming capabilities show up in tests before the real world—reason enough, Amodei says, to publish findings early and push for defenses + policy. Google, for its part, has also detailed ongoing Gemini hardening and safety evaluations against cyber/bio misuse.

Counterpoint

Not everyone agrees on the pace/magnitude: OpenAI COO Brad Lightcap has said they’re not seeing evidence of mass entry-level replacement yet, underscoring how uncertain the labor transition will be.

Why it matters

If entry channels shrink, career ladders in law, finance, consulting and ops could bottleneck—with productivity gains accruing to fewer people unless companies and policymakers re-route on-ramps (apprenticeships, AI-augmented junior tracks, outcome-based credentials). Meanwhile, threat actors are already probing model guardrails, making safety transparency + coordinated takedowns a necessity, not a PR choice.

Prediction (6–12 months)

Expect big firms to redefine junior roles around AI orchestration + verification rather than rote production—and for vendors to ship auditable “safe-use” modes (rate-limited code gen, tighter tool APIs, anomaly detection) as standard SKUs. Regulators will push for misuse reporting and red-teaming disclosures patterned on today’s threat-intel sharing.

Further reading / sources:
Axios on jobs: “white-collar bloodbath?”; Anthropic’s misuse reports; Reuters coverage of cyber misuse takedowns; Google’s Gemini safety hardening; the full interview clip

AI SaaS Founders

🚨Want Millions of Impressions For Your AI SaaS, Done For You?

At uncovernews.co, we specialize in getting AI SaaS products the attention they deserve through strategic influencer marketing campaigns designed to drive millions of impressions at a fraction of the cost!

Get Your AI Startup’s News or Product In Front of Millions Quickly

Google AI

🔍 AI News — Google’s EmbeddingGemma puts on-device RAG on turbo

TL;DR: Google dropped EmbeddingGemma, a 308M-param multilingual encoder that posts top MTEB/MMTEB scores among sub-500M models, runs fully offline in under ~200MB of RAM when quantized, and embeds a 256-token input on EdgeTPU in under ~22ms. It’s plug-and-play across the usual ecosystem (Hugging Face, Kaggle, Vertex AI, SentenceTransformers, Transformers.js, Ollama).

Why this is big

Small + fast: 308M params and under ~200MB of RAM with quantization; EdgeTPU embeds a 256-token input in under ~22ms, which means crisp UX for mobile/edge.
Quality: Trained on 100+ languages with a 2K token context; best-in-class under 500M on MTEB/MMTEB.
Right architecture: Gemma-3-style bi-directional encoder (not a causal LLM) → better semantic retrieval.
Flexible vectors: Defaults to 768-D, but Matryoshka Representation Learning (MRL) lets you truncate to 512/256/128 dimensions without retraining to save space and speed up search.
Ecosystem ready: Weights on Hugging Face, Kaggle, and Vertex AI; 1-liners for SentenceTransformers; Transformers.js demo runs in-browser; Ollama support is live.
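
To make the “flexible vectors” point concrete, here is a rough sketch of what MRL truncation looks like on the consumer side: keep the first N dimensions, then re-normalize so cosine similarity still behaves. The 768-D vector below is synthetic stand-in data, not real EmbeddingGemma output, and the helper names are ours.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]

def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalize for cosine search."""
    if dim > len(vec):
        raise ValueError("target dimension exceeds the embedding size")
    return l2_normalize(vec[:dim])

# Synthetic 768-D vector standing in for a model-produced embedding.
full = l2_normalize([math.sin(i) for i in range(768)])
small = truncate_embedding(full, 256)

print(len(full), len(small))
```

Skipping the re-normalization step is the usual mistake: a truncated slice of a unit vector is no longer unit length, so dot-product scores stop being comparable across documents.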

What it means (in practice)
If you care about private, offline assistants or snappy RAG on laptops/phones, this hits the sweet spot: use EmbeddingGemma to find the best passages locally, then hand them to Gemma 3n (same tokenizer) to write the answer—no cloud needed. For devs indexing big corpora on constrained devices, start at 768-D during experimentation, then truncate to 256-D via MRL for production to shrink memory/disk and speed up similarity search—usually with minimal quality loss. Also: the official Gemma Cookbook ships a quick-start RAG notebook to wire this up fast.
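
The retrieve-then-write flow described above can be sketched in a few lines. Big caveat: the embedder here is a toy word-count stand-in so the example runs with no downloads; in a real app you would swap in EmbeddingGemma for the vectors and pass the assembled prompt to a local generator such as Gemma 3n. The passages, function names, and prompt template are all hypothetical.

```python
import math
import re

def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def make_toy_embedder(corpus):
    """Build a word-count embedder over the corpus vocabulary (NOT a real model)."""
    vocab = sorted({word for text in corpus for word in tokenize(text)})
    index = {word: i for i, word in enumerate(vocab)}

    def embed(text):
        vec = [0.0] * len(vocab)
        for word in tokenize(text):
            if word in index:
                vec[index[word]] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec

    return embed

def top_k(query_vec, doc_vecs, k):
    """Rank documents by dot product (cosine, since vectors are unit length)."""
    scores = [sum(q * d for q, d in zip(query_vec, vec)) for vec in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def build_prompt(question, passages):
    """Assemble the context block that would be handed to the local generator."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

passages = [
    "Reset the router by holding the power button for ten seconds.",
    "The warranty covers battery replacement for two years.",
    "Export invoices as CSV from the billing page.",
]
embed = make_toy_embedder(passages)
doc_vecs = [embed(p) for p in passages]  # index once, entirely on-device

question = "How do I reset the router?"
hits = top_k(embed(question), doc_vecs, k=1)
prompt = build_prompt(question, [passages[i] for i in hits])
print(prompt)
```

Nothing in this loop touches the network, which is the point of the on-device pitch: the index, the query embedding, and the similarity search all live in local memory.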

Deeper cut (how it works)
This is an encoder with bi-directional attention, so it “reads” the whole input at once and compresses it into a normalized vector (cosine-friendly). It also ships task-specific prefixes (e.g., query vs. document) to improve retrieval accuracy—handled automatically in SentenceTransformers, or add them yourself in other stacks.
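
If you are wiring the prefixes up by hand outside SentenceTransformers, the idea looks roughly like this. Treat the exact prefix strings as an assumption on our part: they reflect our reading of the model card’s retrieval format, so check them against the official card before shipping. The helper names are ours.

```python
# ASSUMPTION: prefix formats as we understand them from the model card; verify before use.
QUERY_PREFIX = "task: search result | query: "
DOC_TEMPLATE = "title: {title} | text: {text}"

def format_query(text):
    """Prepend the retrieval-query prefix to a user question."""
    return QUERY_PREFIX + text

def format_document(text, title=None):
    """Wrap a passage in the document template; use 'none' when there is no title."""
    return DOC_TEMPLATE.format(title=title if title else "none", text=text)

def cosine_for_unit_vectors(a, b):
    """For already-normalized vectors, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

query_input = format_query("how long does the battery last")
doc_input = format_document("Battery life is rated at ten hours.", title="Specs")
untitled_input = format_document("Battery life is rated at ten hours.")
print(query_input)
print(doc_input)
```

The strings returned here are what you would pass to the encoder in place of the raw text; queries and documents get different prefixes so the model can tell which side of the search it is embedding.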

Prediction (next 90 days): Expect a wave of on-device RAG apps (knowledge inboxes, field-ops copilots, travel/off-grid agents) that default to 256-D MRL vectors, with Transformers.js browser demos becoming the new “hello world” for embedded search. Enterprise side: more Vertex blueprints pairing EmbeddingGemma for retrieval with Gemini/Gemma writers for generation.

Try it

Model + docs: EmbeddingGemma overview, HF model card, Hugging Face blog.
Run it: SentenceTransformers guide • Transformers.js 3D demo • Ollama library.

(Note on latency: Google’s docs quote under ~22ms on EdgeTPU for a 256-token input; some blogs say under 15ms, but we’re sticking to the official figure.)

30-Second AI Play

🎯 Create & Edit Files with Claude

Open Claude (web or desktop). This feature is live for Max, Team, and Enterprise; Pro rollout is next. Start a new chat.
Say what you want—by file type. Example: “Make a one-page .docx memo summarizing this PDF” or “Build a quarterly budget in .xlsx with formulas + a chart.” Claude can create Word, Excel, PowerPoint, and PDFs.
Add your source material. Paste text or upload files (PDF, DOCX, CSV, XLSX, TXT, HTML, JSON, etc.). Then ask Claude to extract, clean, or reorganize the content.
Iterate in plain English. Ask Claude to “insert a column for CAC,” “recalculate with a 12% YoY growth,” or “rewrite slide 3 for execs.” It edits the live file and returns an updated version.
Export. Click Download to get the native file or Save to Google Drive right from the chat.
Power move (optional): Use Claude’s Analysis tool to run quick calculations on your data before inserting results into the file.

Great starter prompt:
“Create an Excel financial model for a DTC startup: inputs tab (traffic, CVR, AOV, refunds), assumptions tab, and monthly P&L with formulas + charts. Then add a one-page PDF summary for investors.”

▶️ See it in action (1-min video): Preview: Claude can create and edit files.
📰 Announcement & how-to: Anthropic: Claude can now create and edit files • Support guide.

Other Relevant AI News!

🎨 DH3 keeps winning blind A/Bs against Nano-Banana in community tests: try it in the Image Arena.
🐱 LongCat-Flash (560B MoE) from Meituan is open-weights with shortcut-connected MoE—grab the code on the repo.
🌍 HunyuanWorld-Voyager outputs 3D-consistent worlds from a single image with aligned RGB-Depth—see the technical report.
🧊 ReconViGen turns a handful of photos (or video) into accurate textured 3D models: see the project page.
🎮 Oasis 2.0 is a real-time, transformer-driven open world you can sample in your browser—launch the short demo.
🗣️ Chatterbox Multilingual brings 23-language TTS with expressive zero-shot cloning under open licensing—check the GitHub.
🎙️ VibeVoice 7B’s GitHub repo disappeared, but weights/cards are still online: see the ModelScope page.
🤖 Figure 02 shows autonomous dishwasher loading with grasping and rack manipulation—watch the post.
🎧 AudioStory (Tencent ARC) generates long-form narrative audio aligned to video with Apache-2 code—explore the GitHub.

Golden Nuggets

🚀 Open MoE wins on value: Kimi K2 (0905) brings big-model coding to vertical, stack-specific copilots.
🎨 Editors > generators: DH3’s precision/identity fidelity makes surgical, production-grade image edits the new battleground.
📲 On-device RAG is ready: EmbeddingGemma enables private, offline search—use MRL 256-D for speed without big accuracy loss.
🌍 3D from scraps: HunyuanWorld-Voyager + ReconViGen turn single photos/few shots into consistent worlds and usable assets.
🛡️ Work + safety reset: Entry roles shift to AI orchestration/verification; transparent misuse reporting and guardrails are now table stakes.

What did you think about today's edition?

Until our next AI rendezvous,

Anthony | Founder of Uncover AI