mediocreatbest

My collection of links, articles, projects, and more that I find interesting.

echo 1 | sudo tee /sys/bus/pci/devices/<pci-id-of-device>/remove, and then echo 1 | sudo tee /sys/bus/pci/rescan
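
The same remove-then-rescan sequence can be sketched in Python. This is only a rough sketch: it assumes the standard Linux sysfs layout, where per-device entries live under /sys/bus/pci/devices/, and the function names, the sysfs_root parameter, and the example device ID are hypothetical names introduced here for illustration.

```python
from __future__ import annotations

from pathlib import Path


def pci_reset_paths(pci_id: str, sysfs_root: str = "/sys") -> tuple[Path, Path]:
    """Return the (remove, rescan) sysfs paths for one PCI device.

    pci_id is the device address, e.g. "0000:01:00.0" (a made-up example).
    sysfs_root is a hypothetical parameter so the paths can be pointed at a
    scratch directory instead of the real /sys.
    """
    base = Path(sysfs_root) / "bus" / "pci"
    return base / "devices" / pci_id / "remove", base / "rescan"


def remove_and_rescan(pci_id: str, sysfs_root: str = "/sys") -> None:
    """Write "1" to the device's remove file, then to the bus rescan file.

    Against the real /sys this needs root privileges and detaches the device
    until the rescan finds it again.
    """
    remove_path, rescan_path = pci_reset_paths(pci_id, sysfs_root)
    remove_path.write_text("1")  # same effect as: echo 1 | sudo tee .../remove
    rescan_path.write_text("1")  # same effect as: echo 1 | sudo tee .../rescan
```

Calling remove_and_rescan("0000:01:00.0") as root would mirror the two shell commands above; substitute the address that lspci -D reports for your device.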

I'm not sure I interpreted the results correctly. It seems like the things TF Lite natively supports (apparently, their custom CNN model trained on MNIST) get really fast, while other things are hit-or-miss.

I have linked the pricing page because I think that's the most important aspect of a service like this.

The price isn't too expensive, but it isn't particularly cheap either.

Compared to OpenAI's ChatGPT models, for generating 1 million tokens (roughly the length of the King James Bible), you're looking at:

  • OpenAI's gpt-3.5-turbo ("ChatGPT-3.5") is $2 / 1M tokens
  • TextSynth's M2M100 1.2B (cheapest) is $3 / 1M tokens
  • OpenAI's gpt-4 ("ChatGPT-4") is $4 / 1M tokens
  • TextSynth's GPT-NeoX 20B (most expensive) is $35 / 1M tokens
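
To scale these figures to a different workload, a small calculator along these lines may help. It is a minimal sketch: the prices are copied from the list above and may well be out of date, and the dictionary and function names are made up for this example rather than taken from either provider's API.

```python
# Prices in USD per 1 million tokens, copied from the list above.
# They are a snapshot quoted in this post, not live pricing.
PRICE_PER_MILLION_USD = {
    "OpenAI gpt-3.5-turbo": 2.0,
    "TextSynth M2M100 1.2B": 3.0,
    "OpenAI gpt-4": 4.0,
    "TextSynth GPT-NeoX 20B": 35.0,
}


def cost_usd(model: str, tokens: int) -> float:
    """Estimated cost in USD of generating `tokens` tokens with `model`."""
    return PRICE_PER_MILLION_USD[model] * tokens / 1_000_000


def compare(tokens: int) -> list[tuple[str, float]]:
    """All models with their estimated cost for `tokens`, cheapest first."""
    costs = [(model, cost_usd(model, tokens)) for model in PRICE_PER_MILLION_USD]
    return sorted(costs, key=lambda pair: pair[1])
```

For example, compare(250_000) lists what a quarter of a Bible-sized generation would cost on each model at the quoted rates.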

Abstract: "Prompting is now the primary way to utilize the multitask capabilities of language models (LMs), but prompts occupy valuable space in the input context window, and re-encoding the same prompt is computationally inefficient. Finetuning and distillation methods allow for specialization of LMs without prompting, but require retraining the model for each task. To avoid this trade-off entirely, we present gisting, which trains an LM to compress prompts into smaller sets of "gist" tokens which can be reused for compute efficiency. Gist models can be easily trained as part of instruction finetuning via a restricted attention mask that encourages prompt compression. On decoder (LLaMA-7B) and encoder-decoder (FLAN-T5-XXL) LMs, gisting enables up to 26x compression of prompts, resulting in up to 40% FLOPs reductions, 4.2% wall time speedups, storage savings, and minimal loss in output quality."
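
To make the "restricted attention mask" idea concrete, here is a rough sketch for the decoder-only case. It reflects my reading of the abstract and is not the paper's code: it assumes the sequence is laid out as prompt tokens, then gist tokens, then everything else, and that tokens after the gist span are blocked from attending to the prompt, so any prompt information has to flow through the gist tokens. The function name and parameters are hypothetical.

```python
def gist_attention_mask(seq_len: int, gist_start: int, gist_end: int) -> list[list[bool]]:
    """Build a boolean mask where mask[i][j] is True if token i may attend to token j.

    Assumed layout (an illustration, not the paper's exact implementation):
      positions [0, gist_start)         the prompt
      positions [gist_start, gist_end)  the gist tokens
      positions [gist_end, seq_len)     the input and output that follow
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            allowed = j <= i  # ordinary causal masking: no looking ahead
            if i >= gist_end and j < gist_start:
                # Tokens after the gist span cannot see the raw prompt;
                # they only see what was compressed into the gist tokens.
                allowed = False
            row.append(allowed)
        mask.append(row)
    return mask
```

With seq_len=6, gist_start=2, gist_end=3, the single gist token at position 2 can still attend to prompt positions 0 and 1, while positions 3 to 5 can attend to the gist token and to each other but not to the prompt.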

The prompt: "compress the following text in a way that fits in a tweet (ideally) and such that you (GPT-4) can reconstruct the intention of the human who wrote text as close as possible to the original intention. This is for yourself. It does not need to be human readable or understandable. Abuse of language mixing, abbreviations, symbols (unicode and emoji), or any other encodings or internal representations is all permissible, as long as it, if pasted in a new inference cycle, will yield near-identical results as the original text:"