
Speed Up GPT-J-6B: From Minutes to Seconds with GGML Conversion

November 26, 2023

A guide to dramatically improving GPT-J-6B inference speed. I'll show you how I converted a Hugging Face model from its standard format to GGML, cutting wait times from minutes to seconds on my own machine. It covers the conversion process, memory requirements, and running the optimized model.
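To give a rough feel for the memory side, here is a back-of-the-envelope sketch of how much RAM the weights alone take at different precisions. The parameter count (about 6 billion, taken from the model's name) and the list of precisions are assumptions for illustration, not figures from my conversion run. Real GGML files add some overhead for quantization scales and metadata, and inference needs extra memory on top of the weights.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: roughly 6 billion parameters, as the model name suggests.
N_PARAMS = 6e9

# Illustrative precisions; the 4-bit row ignores per-block scale overhead.
for label, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

The point of the arithmetic is that halving the bits per weight halves the footprint, which is why a quantized file can fit on a machine where the full-precision checkpoint cannot.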


Related Posts

Testing Twelve AI Image Models with Marcel Proust's Madeleine (Or: How We Chose Hunyuan for Hexagram 24)

Upgrading 8-Bit Oracle's Hexagram 24: Return with Proust's famous involuntary memory scene, then systematically testing 12 diverse AI image models through fal.ai to find which one actually captures that precise moment when past and present collapse together.

November 18, 2025

On Going Legit on Twitter/X (Or: How I Learned to Stop Scraping and Love the Rate Limit)

A three-day journey through Twitter API v2, v1.1, OAuth, rate limits, and the gradual acceptance that you can't actually hack your way around free-tier constraints.

November 10, 2025

LLMs Love Markdown. CMSs Love Databases. I Built a WordPress Bridge Nobody Should Need.

A technical post-mortem on why WordPress's REST API is theoretically elegant but practically unusable for CI/CD, how SiteGround's security theater blocks legitimate automation, and why SSH + WP-CLI is the uncomfortable truth about 'modern' CMS workflows.

November 7, 2025