Model:
philschmid/t5-11b-sharded
This is a fork of t5-11b that implements a custom handler.py as an example of how to use t5-11b with Inference Endpoints on a single NVIDIA T4.
Hugging Face Inference Endpoints can be used with an HTTP client in any language. We will use Python and the requests library to send our requests (make sure you have it installed: pip install requests).
import json
import requests as r

ENDPOINT_URL = ""  # URL of your endpoint
HF_TOKEN = ""  # your Hugging Face access token

# payload samples
regular_payload = {"inputs": "translate English to German: The weather is nice today."}
parameter_payload = {
    "inputs": "translate English to German: Hello my name is Philipp and I am a Technical Leader at Hugging Face",
    "parameters": {
        "max_length": 40,
    },
}

# HTTP headers for authorization
headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

# send request
response = r.post(ENDPOINT_URL, headers=headers, json=parameter_payload)
generated_text = response.json()

print(generated_text)
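Since the endpoint only expects an HTTP POST with a JSON body and a bearer token, the same request can also be sent without any third-party package. The following is a minimal sketch using Python's standard-library urllib.request. The helper names build_request and query and the 120-second default timeout are illustrative assumptions, not part of this repository.

```python
import json
import urllib.request


def build_request(endpoint_url, hf_token, inputs, parameters=None):
    """Build (but do not send) a POST request for the endpoint.

    Hypothetical helper: mirrors the payload and headers used above.
    """
    payload = {"inputs": inputs}
    if parameters:
        # Optional generation parameters, e.g. {"max_length": 40}
        payload["parameters"] = parameters
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(endpoint_url, hf_token, inputs, parameters=None, timeout=120):
    """Send the request and return the decoded JSON response.

    The timeout value is an assumption; adjust it to your endpoint.
    """
    request = build_request(endpoint_url, hf_token, inputs, parameters)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With ENDPOINT_URL and HF_TOKEN filled in, calling query(ENDPOINT_URL, HF_TOKEN, "translate English to German: The weather is nice today.", {"max_length": 40}) would send the same kind of request as the requests-based example. Keeping request construction separate from sending makes the payload and headers easy to inspect before anything goes over the network.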