Executive Summary / Key Takeaways
- Àkàndé is an open-source Python voice assistant that combines OpenAI Whisper speech-to-text, GPT-4 chat completions, local SQLite response caching, and fpdf2 PDF export into a single voice-driven workflow that requires neither cloud storage nor local AI model weights.
- The SQLite cache stores SHA-256 hashes of normalised query strings mapped to the API response text; a cache hit consumes no tokens and returns in under 10 ms, which makes repeated queries (such as revisiting a decision from earlier in a meeting) effectively free.
- Multi-turn conversation is maintained by building the messages list in memory and passing it on every Chat Completions API call. The model receives the full session history so it can refer to earlier exchanges, at the cost of token usage that grows with every turn.
- PDF summary generation serialises the session's messages list into a formatted fpdf2 document: user turns and assistant turns are labelled, timestamps are included, and automatic pagination handles sessions of any length. The file is written to the local filesystem, not uploaded.
- Privacy boundary: only the live query (and the session history, up to the context window limit) leaves the device. Audio recordings, transcripts, and cached responses are not sent to any remote service other than the OpenAI API.
Àkàndé is an open-source Python voice assistant built around three cooperating components: OpenAI Whisper for speech recognition, the GPT-4 Chat Completions API for language understanding and generation, and a local SQLite database for response caching and session persistence. The result is a voice-driven workflow that can run on a laptop without local model weights, separate storage infrastructure, or a container stack.
This article describes the technical architecture of each component, the design decisions around caching and multi-turn context, and the PDF export pipeline.
Pipeline Overview
A single Àkàndé interaction follows this sequence:
- Audio capture: the user speaks, and the application records the audio to a temporary WAV file using sounddevice or another compatible audio library.
- Speech-to-text: the WAV file is sent to openai.audio.transcriptions.create() (the Whisper API), and the transcript is returned as a plain string.
- Cache lookup: the transcript is normalised (lowercased, whitespace collapsed) and hashed with SHA-256; the hash is looked up in the local SQLite table response_cache.
- API call or cache hit: on a miss, the transcript is appended to the session's messages list and sent to openai.chat.completions.create(); the response text is then stored in the cache.
- Text-to-speech: the response text is converted to audio using the openai.audio.speech.create() endpoint (TTS) or a local TTS library, and played back.
- PDF export (on demand): the full messages list is serialised to a formatted fpdf2 document and written to disk.
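The cache-lookup and API-call steps above can be sketched as a single function. This is a minimal illustration rather than the project's actual code: `handle_turn`, the in-memory `dict` standing in for the SQLite table, and the injected `complete` callable (a stand-in for the Chat Completions call) are all hypothetical names. It also assumes that a cache hit leaves the session history untouched, which the description above does not specify.

```python
import hashlib
from typing import Callable


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace, as in the cache-lookup step."""
    return " ".join(text.lower().split())


def handle_turn(
    transcript: str,
    messages: list[dict],
    cache: dict[str, str],
    complete: Callable[[list[dict]], str],
) -> tuple[str, bool]:
    """Return (response_text, was_cache_hit) for one transcribed utterance."""
    key = hashlib.sha256(normalise(transcript).encode()).hexdigest()
    if key in cache:
        # Hit: answer from the cache, no API call (assumption: history unchanged).
        return cache[key], True
    # Miss: extend the session history, call the model, store the reply.
    messages.append({"role": "user", "content": transcript})
    reply = complete(messages)
    messages.append({"role": "assistant", "content": reply})
    cache[key] = reply
    return reply, False
```

Passing the completion function in as a parameter keeps the sketch runnable without network access; in a real session it would wrap the OpenAI call and the cache would be the SQLite table.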
OpenAI Integration: Chat Completions and Whisper
Àkàndé uses the openai Python SDK for both speech recognition and text generation. The Whisper transcription call:
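    import openai

    # Send the recorded WAV file to the Whisper transcription endpoint.
    with open(audio_file_path, "rb") as f:
        transcript = openai.audio.transcriptions.create(
            model="whisper-1",
            file=f,
            # language is omitted so that Whisper auto-detects it
        )
    user_text = transcript.text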
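The transcription call expects a WAV file on disk. The sketch below shows one way to produce that temporary file using only the standard library. It assumes the audio has already been captured as mono 16-bit PCM samples (for instance with an audio library such as sounddevice); `write_temp_wav` and the 16 kHz default sample rate are illustrative choices, not taken from the project.

```python
import struct
import tempfile
import wave


def write_temp_wav(samples: list[int], sample_rate: int = 16_000) -> str:
    """Write mono 16-bit PCM samples to a temporary WAV file; return its path."""
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()  # reopen by name so the same code also works on Windows
    with wave.open(tmp.name, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        # Pack the integers as little-endian signed 16-bit values.
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return tmp.name
```

The caller is responsible for deleting the file once the transcript has been returned, which keeps raw audio from lingering on disk.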
The Chat Completions call maintains a per-session messages list:
    # Append the user's turn, then send the whole session history.
    messages.append({"role": "user", "content": user_text})
    response = openai.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=messages,
        temperature=0.2,
        max_tokens=1024,
    )
    assistant_text = response.choices[0].message.content
    # Record the reply so later turns can refer back to it.
    messages.append({"role": "assistant", "content": assistant_text})
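Because the whole list is resent on every call, prompt tokens grow with each turn. The following back-of-the-envelope sketch makes that growth concrete. It assumes, purely for illustration, that every user and assistant message has the same token count; `cumulative_prompt_tokens` is a hypothetical helper, and real counts depend on the tokenizer and on message lengths.

```python
def cumulative_prompt_tokens(
    turns: int, tokens_per_message: int, system_tokens: int = 0
) -> int:
    """Total prompt tokens sent over a session of `turns` user turns."""
    total = 0
    for n in range(1, turns + 1):
        # Turn n sends the system prompt, n user messages and n - 1 earlier replies.
        messages_sent = 2 * n - 1
        total += system_tokens + messages_sent * tokens_per_message
    return total
```

Under this assumption the total is `turns * system_tokens + turns**2 * tokens_per_message`, so ten turns of 100-token messages send about 10,000 prompt tokens in total, compared with 1,000 if each question were sent on its own.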
A single system prompt is included at the start of the session. It controls Àkàndé's persona, the output format, and any domain-specific constraints:
    messages = [
        {
            "role": "system",
            "content": (
                "You are Àkàndé, a concise executive assistant. "
                "Respond in plain prose. Do not use markdown. "
                "If asked to summarise, produce three bullet points maximum."
            ),
        }
    ]
Setting temperature=0.2 trades creative variety for more deterministic output, which matters for factual queries such as recalling a decision from earlier in the session.
SQLite Response Cache
The cache schema is minimal:
    CREATE TABLE IF NOT EXISTS response_cache (
        query_hash TEXT PRIMARY KEY,
        response   TEXT NOT NULL,
        created_at INTEGER NOT NULL  -- Unix timestamp
    );
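A minimal sketch of applying this schema when the application starts is shown below. The function name `open_cache` and the default file name `akande_cache.db` are assumptions made for the example; the schema itself is the one given above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS response_cache (
    query_hash TEXT PRIMARY KEY,
    response   TEXT NOT NULL,
    created_at INTEGER NOT NULL  -- Unix timestamp
);
"""


def open_cache(path: str = "akande_cache.db") -> sqlite3.Connection:
    """Open (or create) the cache database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)  # IF NOT EXISTS makes this safe on every start
    return conn
```

Passing `":memory:"` as the path gives a throwaway database, which is convenient for tests.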
The lookup and write path:
    import hashlib
    import sqlite3
    import time

    def _normalise(text: str) -> str:
        # Lowercase and collapse whitespace so trivial variants share one key.
        return " ".join(text.lower().split())

    def cache_get(conn: sqlite3.Connection, query: str) -> str | None:
        # Return the cached response for this query, or None on a miss.
        h = hashlib.sha256(_normalise(query).encode()).hexdigest()
        row = conn.execute(
            "SELECT response FROM response_cache WHERE query_hash = ?", (h,)
        ).fetchone()
        return row[0] if row else None

    def cache_set(conn: sqlite3.Connection, query: str, response: str) -> None:
        # Insert or overwrite the entry, stamping it with the current Unix time.
        h = hashlib.sha256(_normalise(query).encode()).hexdigest()
        conn.execute(
            "INSERT OR REPLACE INTO response_cache VALUES (?, ?, ?)",
            (h, response, int(time.time())),
        )
        conn.commit()
INSERT OR REPLACE ensures that a cached response is overwritten if the same query is sent again after a model upgrade. A TTL-based eviction query (DELETE FROM response_cache WHERE created_at < ?) can be run at startup to bound the size of the cache.
Cache-hit performance: a SQLite lookup on a local SSD returns in under 1 ms for tables of up to ~100,000 rows. The round-trip latency of a live GPT-4 API call is typically 600–900 ms for short responses. For daily briefings built around a small set of recurring queries, the cache eliminates most API calls after the first session.
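To see what these figures imply for a mixed workload, the average latency per query can be estimated from the hit rate. The sketch below is a simple weighted average, not a measurement: the 1 ms hit time and the 750 ms miss time (the midpoint of the 600–900 ms range) are taken from the figures above, and `expected_latency_ms` is a hypothetical helper.

```python
def expected_latency_ms(
    hit_rate: float, hit_ms: float = 1.0, miss_ms: float = 750.0
) -> float:
    """Average response latency for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms
```

With an 80% hit rate the estimate is roughly 151 ms per query, about a fifth of the uncached figure.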
PDF Summary Generation
PDF export uses fpdf2, an actively maintained Python PDF library with no binary dependencies:
    from datetime import datetime

    from fpdf import FPDF

    def export_session_pdf(messages: list[dict], output_path: str) -> None:
        pdf = FPDF()
        pdf.set_margins(20, 20, 20)  # set margins before the first page is added
        pdf.add_page()

        # Title line with the export time. A plain hyphen is used because the
        # built-in Helvetica font cannot encode an em dash.
        pdf.set_font("Helvetica", "B", 14)
        pdf.cell(
            0, 10, f"Àkàndé Session - {datetime.now():%Y-%m-%d %H:%M}",
            new_x="LMARGIN", new_y="NEXT",  # replaces the deprecated ln=True
        )
        pdf.ln(4)

        for msg in messages:
            if msg["role"] == "system":
                continue  # the system prompt is not part of the transcript
            label = "You" if msg["role"] == "user" else "Àkàndé"
            pdf.set_font("Helvetica", "B", 10)
            pdf.cell(0, 6, label, new_x="LMARGIN", new_y="NEXT")
            pdf.set_font("Helvetica", size=10)
            pdf.multi_cell(0, 5, msg["content"])  # wraps long text across lines
            pdf.ln(3)

        pdf.output(output_path)
multi_cell() handles line wrapping, and fpdf2's automatic page breaks take care of pagination, so sessions of any length produce a cleanly laid-out document without manual pagination logic. The output relies on the standard built-in Helvetica font, so no font files are embedded. Because PDF/A requires fonts to be embedded, the file should not be assumed to be PDF/A-compliant as generated.
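One practical consequence of relying on the built-in Helvetica font is that, as far as the author of this sketch understands fpdf2's core fonts, they cover only a Latin-1 character range, so model output containing curly quotes, dashes, or non-Latin scripts can fail to render. The helper below is a hedged workaround, not part of the project: `to_core_font_text` is a hypothetical name, and it swaps a few common typographic characters for ASCII equivalents and replaces anything else outside Latin-1 with "?".

```python
def to_core_font_text(text: str) -> str:
    """Make text safe for a Latin-1-only font, degrading unsupported characters."""
    replacements = {
        "\u2014": "-",    # em dash
        "\u2013": "-",    # en dash
        "\u2018": "'",    # left single quote
        "\u2019": "'",    # right single quote
        "\u201c": '"',    # left double quote
        "\u201d": '"',    # right double quote
        "\u2026": "...",  # ellipsis
    }
    for src, dst in replacements.items():
        text = text.replace(src, dst)
    # Anything still outside Latin-1 becomes "?" rather than raising an error.
    return text.encode("latin-1", errors="replace").decode("latin-1")
```

Applying this to each message before it is written keeps the export from failing on unexpected characters. Where full Unicode output matters, registering a Unicode TrueType font with fpdf2 is the more faithful alternative.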
Privacy Design
The privacy boundary in Àkàndé is defined by three facts:
- Audio is sent to the Whisper API over HTTPS and is not retained by OpenAI after the API call (per OpenAI's API data usage policy as of February 2024).
- Chat Completions API calls transmit the session's messages list, which may contain the full conversation history for multi-turn sessions.
- The SQLite database and the PDF files live entirely on the local filesystem; there is no background synchronisation to any cloud service.
For executive use cases involving sensitive topics (M&A discussions, personnel matters, regulatory strategy), the session history transmitted to the API should be reviewed against the organisation's AI usage policy before deployment. Note that max_tokens only caps the length of the model's reply; it does not limit what is sent. To keep unintended context from being transmitted beyond the intended disclosure scope, the messages list itself has to be trimmed before each call, for example to the most recent turns.
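One simple way to bound what is transmitted is to send only the system prompt plus the most recent exchanges. The sketch below illustrates that idea; `trim_history` and its `max_turns` parameter are hypothetical and not part of the described implementation, and counting turns rather than tokens is a simplification.

```python
def trim_history(messages: list[dict], max_turns: int) -> list[dict]:
    """Keep system messages plus the last 2 * max_turns user/assistant messages."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # A turn is one user message and one assistant reply, hence the factor of 2.
    recent = dialogue[-2 * max_turns:] if max_turns > 0 else []
    return system + recent
```

The trimmed list would be passed as the messages argument of the API call while the full list stays in memory for the PDF export, so the local record remains complete even though less is disclosed.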
Frequently Asked Questions
Does Àkàndé retain conversation history after a session ends?
The in-memory messages list is discarded when the process exits. Conversation history is retained only if the user triggers a PDF export or if a separate persistence layer is added. The SQLite cache stores query hashes and response text, not the full conversation context.
How does the cache handle queries that are similar but not identical?
The cache uses exact-match hashing on normalised query strings. Two queries that differ by a single word produce different hashes and trigger separate API calls. Semantic caching (using embedding similarity to match near-duplicate queries) would require an additional vector search step and is not part of the base implementation.
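The difference between normalisation and semantic matching is easy to demonstrate. In the short example below, `query_key` is a hypothetical helper that mirrors the normalise-then-hash step described earlier; the sample questions are made up.

```python
import hashlib


def query_key(query: str) -> str:
    """Normalise (lowercase, collapse whitespace) and hash a query."""
    normalised = " ".join(query.lower().split())
    return hashlib.sha256(normalised.encode()).hexdigest()


# Case and spacing differences collapse to the same key...
same = query_key("What did we decide?") == query_key("  what did  WE decide? ")
# ...but a one-word change in wording does not.
different = query_key("What did we decide?") == query_key("What did we agree?")
print(same, different)  # prints: True False
```

Only the first pair would be served from the cache; the second would go to the API even though the two questions mean nearly the same thing.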
Which GPT model does Àkàndé use by default?
The default is gpt-4-turbo-preview as of February 2024. The model name is a configuration parameter, so any OpenAI chat completion model can be substituted. Switching to gpt-3.5-turbo reduces the API cost by roughly 20× per token but lowers reasoning quality on complex multi-step queries.
Can the PDF output format be customised?
Yes. The fpdf2 export function takes only the messages list and an output path as inputs, so fonts, margins, page size, header elements, and labels can all be changed by editing the export function. fpdf2 also supports images, tables, and Unicode fonts, which allows branded document layouts for organisations with specific branding requirements.
Last reviewed .
