Mon, Apr 22, 24, Aidocent setup to performance test within 5 days
This is a draft, the content is not complete and of poor quality!

Project post

freelancers


I need an experienced data scientist or AI specialist to assist with evaluating a Korean dataset using OpenAI API calls. My goal is to set up a good testing template that I can repeat.

Key Aspects:

  • Inference Speed: Inference speed is vital for this project. The target is a moderate inference speed with 2,000 users in mind.
  • Total Processing Accuracy (TPA): The total processing accuracy of this evaluation should be high.
  • Balance: TPA and inference speed carry equal importance (<1 /s is imperative).
  • Implementation: Implement this in a shared Google Colab or Jupyter notebook for on-demand use of the application.
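The inference-speed requirement above could be checked with a small timing harness. The sketch below is only an illustration: `fake_model_call` is a hypothetical stand-in for the real API call, the concurrency level is arbitrary, and reading "<1 /s" as "under one second per response" is an assumption, not something the requirement states.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model_call(prompt: str) -> str:
    """Hypothetical stand-in for the real API call; replace with the actual client."""
    time.sleep(0.01)  # simulate network + inference time
    return f"echo: {prompt}"


def timed_call(call, prompt):
    """Return (latency_seconds, response) for a single request."""
    start = time.perf_counter()
    response = call(prompt)
    return time.perf_counter() - start, response


def measure_latency(call, prompts, concurrency=8):
    """Send prompts through a thread pool and summarize per-request latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda p: timed_call(call, p), prompts))
    latencies = sorted(latency for latency, _ in results)
    p95_index = max(0, int(round(0.95 * len(latencies))) - 1)
    return {
        "count": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


summary = measure_latency(fake_model_call, [f"질문 {i}" for i in range(40)])
# Assumed reading of the target: 95% of responses finish in under one second.
within_budget = summary["p95_s"] < 1.0
print(summary, within_budget)
```

A thread pool only approximates concurrent users from a single machine; a load of 2,000 users would more realistically be driven from JMeter.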

Ideal Skills and Experience:

  • Proficiency in Korean language and understanding of language nuances.
  • Previous experience with data evaluation using OpenAI API calls.
  • Proven track record in achieving optimal inference speeds for data processing.
  • Demonstrated ability to balance and optimize processing accuracy alongside inference speed.
  • Strong communication skills to provide regular updates on the project’s progress.

Refer to this documentation: https://python.langchain.com/docs/langsmith/walkthrough/

Request access to the example document, where all testing data and instructions should be maintained and updated: https://docs.google.com/document/d/16GFgABbbnH7RXgL56SXNu41xRspstgtVcy0TQxmwd-A/edit?usp=drive_link

API example: https://api-engtoprod.meta-wedit.com/api-docs#/USER%20API/UsersController_perpectConversation

Deliverables:

  • The test procedure as described and per our discussion, in the format of the sample Google Doc.
  • A set of scripts: JMeter configs, and Python for the accuracy and inference tests (hopefully in Colab).
  • A working PC implementation to run the test on a machine and on a VM (dockerized), so the same testing environment can be replicated repeatedly.
  • Full documentation of the test procedure: steps, screenshots, reference links, test runs, and setups.

Revised requirements

Key responsibilities:
- Conduct performance tests on LLMs.
- Analyse the results and compare them to identify the best-performing system.
- Provide recommendations on how to optimize the performance of the chosen LLM.
- Set up a local testing environment (chatbot, playground dashboard, JMeter 5.5).
- Set up a cloud server (API server).
- Google Colab, Jupyter Notebook.
- All configuration and test scripts (Python, bash, and exe).
- Dockerized images for future implementation.
- Our Korean dataset of 20,000, based on an existing Korean dataset, published to a HuggingFace workspace.
- Documentation: 1. testing items/procedures; 2. instructions and manuals for this project.
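One way to compare results and identify the best-performing system is to filter candidates by a latency budget and then rank by accuracy. The sketch below assumes that rule; the `ModelResult` type, the one-second budget, and the placeholder names and numbers are all hypothetical and are not measurements from this project.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    accuracy: float       # fraction of correct answers, 0.0 to 1.0
    p95_latency_s: float  # 95th-percentile response time in seconds


def pick_best(results, latency_budget_s=1.0):
    """Return the most accurate model within the latency budget, or None."""
    eligible = [r for r in results if r.p95_latency_s < latency_budget_s]
    if not eligible:
        return None
    # Higher accuracy wins; a tie is broken by the lower latency.
    return max(eligible, key=lambda r: (r.accuracy, -r.p95_latency_s))


# Placeholder values for illustration only.
candidates = [
    ModelResult("model-a", accuracy=0.82, p95_latency_s=0.70),
    ModelResult("model-b", accuracy=0.90, p95_latency_s=1.40),
    ModelResult("model-c", accuracy=0.82, p95_latency_s=0.55),
]
best = pick_best(candidates)
print(best.name if best else "no model meets the latency budget")
```

Treating latency as a hard constraint rather than a weighted score is a design choice; a weighted score would fit the "balanced importance" wording equally well.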

Models to consider:
llama3-8b-8192
text-davinci-003
Mixtral 8x22b
Whisper

Reference:
https://pf7.eggs.or.kr/aigenerative_overview.html

Sample testing items (please send me a permission request with your name)
https://docs.google.com/document/d/16GFgABbbnH7RXgL56SXNu41xRspstgtVcy0TQxmwd-A/edit

Project resources

googleDrive

overview

| TestItems | uconc | q&aDataset / hfConverse | hfQ&A / hfQ&A2 | projectBlog | hfQ&A3

datasets

hfDataset check under metrics for various testing items

HuggingFace Korean datasets: everyday conversation; various Q&A 1; Naver Knowledge iN Q&A; various Q&A 2:

koreanDataset
koreanDataset1
koreanDataset2
koreanDataset3

Various Q&A 2: https://huggingface.co/datasets/unoooo/alpaca-korean
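Before running an evaluation, each dataset record has to be turned into a prompt and a reference answer. The sketch below assumes the usual Alpaca column names (`instruction`, `input`, `output`); the actual columns of this dataset have not been verified here, and the two sample records are made up for illustration. With the `datasets` library installed, the real records could be loaded with `load_dataset("unoooo/alpaca-korean")`.

```python
def to_eval_pair(record):
    """Convert one Alpaca-style record into a prompt/reference pair.

    Assumes keys "instruction", "input", and "output"; missing keys are
    treated as empty strings.
    """
    instruction = record.get("instruction", "").strip()
    context = record.get("input", "").strip()
    prompt = f"{instruction}\n\n{context}" if context else instruction
    return {"prompt": prompt, "reference": record.get("output", "").strip()}


# Hypothetical sample records, not taken from the dataset.
samples = [
    {"instruction": "대한민국의 수도는 어디인가요?", "input": "", "output": "서울입니다."},
    {"instruction": "다음 문장을 영어로 번역하세요.", "input": "안녕하세요", "output": "Hello"},
]
pairs = [to_eval_pair(record) for record in samples]
for pair in pairs:
    print(repr(pair["prompt"]), "->", pair["reference"])
```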

our dataset

ucon-slmDatasets

other models

Models to consider: llama3-8b-8192 text-davinci-003 Mixtral 8x22b Whisper

AWS server

WindowsServer port: on MS Remote Desktop. itsInstance: accessible only via the AWS console.

The AWS Windows Server runs JMeter against the LLM application on the cloud.
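Once JMeter has run against the application, its results file can be summarized in Python. The sketch below assumes results were saved in JMeter's CSV format with a header row and uses only the `elapsed` (milliseconds) and `success` columns; the sample rows and the 1000 ms budget are made up for illustration.

```python
import csv
import io
import statistics

# Made-up sample in JMeter's CSV results layout (subset of columns).
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1713744000000,420,chat,200,true
1713744000500,610,chat,200,true
1713744001000,1350,chat,200,true
1713744001500,300,chat,500,false
"""


def summarize_jtl(handle, budget_ms=1000):
    """Summarize response times and errors from a JMeter CSV results file."""
    rows = list(csv.DictReader(handle))
    elapsed = [int(row["elapsed"]) for row in rows]
    errors = sum(1 for row in rows if row["success"].lower() != "true")
    return {
        "samples": len(rows),
        "mean_ms": statistics.mean(elapsed),
        "max_ms": max(elapsed),
        "error_rate": errors / len(rows),
        "over_budget": sum(1 for ms in elapsed if ms >= budget_ms),
    }


report = summarize_jtl(io.StringIO(SAMPLE_JTL))
print(report)
```

For a real run, pass an open file handle instead of the `io.StringIO` sample, for example `open("results.jtl", newline="")`, where the file name is a placeholder.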

ApplicationServer

Windows Server

Setup, JMeter

Creds and IP

mlOps

googleColab

JupyterNotebook

LLM candidates


misTral: Mistral LLM API docs

groqCloud and playgroundConsole

ncSoftVaroq: Korean LLM by NCSoft, based on the VARCO LLM and Kendra

gpt4all: open-source, locally hosted chatbot

Day 1, Day 2, Day 3, Day 4, Day 5, Day 6


Day1

  1. Shared creds with freelancers
  2. Tasks assigned among the group
  3. AWS WindowsServer instance
  4. Completed the testing PC on the server

Day2

  5. Application server
  6. API server
  7. Library install and setup
  8. Download LLM models
  9. Write scripts

Day3 (today)

  10. Write scripts using quantization of LLM models
  11. Set up the application server
  12. Python library issue
  13. Model candidates per quantization
  14. Flask app installed
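When choosing model candidates per quantization level, a rough estimate of weight memory helps decide what fits on the server. The sketch below uses the standard approximation of parameters times bits per weight; it ignores activations, the KV cache, and quantization overhead, and the 8-billion parameter count is an assumed round figure for an 8B-class model.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


PARAMS_8B = 8e9  # assumed round figure for an 8B-class model

estimates = {bits: weight_memory_gib(PARAMS_8B, bits) for bits in (16, 8, 4)}
for bits, gib in estimates.items():
    print(f"{bits}-bit weights: about {gib:.1f} GiB")
```

Actual memory use at inference time will be higher than these figures, so they are best read as lower bounds.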

performance test

hfDatasetsMetrics

metrics

perplexity, accuracy, wer

accuracy
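As a minimal sketch of the accuracy metric, the function below computes exact-match accuracy after light normalization. The normalization (Unicode NFC, which matters for composed versus decomposed Hangul, plus whitespace collapsing) is an assumption about what should count as a match, and the example strings are made up.

```python
import unicodedata


def normalize(text):
    """Apply Unicode NFC and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Made-up example: two of three answers match.
score = exact_match_accuracy(["서울", "부산 ", "대구"], ["서울", "부산", "인천"])
print(f"accuracy = {score:.3f}")
```

Exact match is strict for free-form answers; a semantic metric may suit open-ended Korean responses better.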

perplexity
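Perplexity is the exponential of the negative mean log-probability per token. The sketch below applies that definition to a list of natural-log token probabilities; how those log-probabilities are obtained from a given model or API is left open here, and the input values are illustrative.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean(log p))."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# If every token has probability 0.5, the perplexity is approximately 2.
uniform = [math.log(0.5)] * 4
ppl = perplexity(uniform)
print(f"perplexity = {ppl:.3f}")
```

Lower values mean the model found the text less surprising; values are only comparable between models that share a tokenizer.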

wer
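Word error rate (WER) is the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The sketch below implements it with a plain Levenshtein distance and whitespace tokenization, which is an assumption; for Korean, a character-level variant (CER) can be obtained by passing character lists to the same `edit_distance` helper. The example sentences are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# One substituted word out of three reference words.
rate = wer("오늘 날씨 어때", "오늘 날씨가 어때")
print(f"WER = {rate:.3f}")
```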


├───metrics
│   ├───accuracy
│   ├───bertscore
│   ├───bleu
│   ├───bleurt
│   ├───cer
│   ├───chrf
│   ├───code_eval
│   ├───comet
│   ├───competition_math
│   ├───coval
│   ├───cuad
│   ├───exact_match
│   ├───f1
│   ├───frugalscore
│   ├───glue
│   ├───google_bleu
│   ├───indic_glue
│   ├───mae
│   ├───mahalanobis
│   ├───matthews_correlation
│   ├───mauve
│   ├───mean_iou
│   ├───meteor
│   ├───mse
│   ├───pearsonr
│   ├───perplexity
│   ├───precision
│   ├───recall
│   ├───roc_auc
│   ├───rouge
│   ├───sacrebleu
│   ├───sari
│   ├───seqeval
│   ├───spearmanr
│   ├───squad
│   ├───squad_v2
│   ├───super_glue
│   ├───ter
│   ├───wer
│   ├───wiki_split
│   ├───xnli
│   └───xtreme_s
