>>100160332
Yeah so I was both constructing and interpreting the request incorrectly. The only headers you need to set are Content-Type (application/json) and the auth token.
The response from fiz will always be "chunked": the content arrives as a series of byte arrays that you have to decode as UTF-8 and concatenate into the full message before you can parse the JSON.
The param 'stream=true' has to be passed both to the post() call and in the POST data.
The implementation is actually standard OpenAI and not fiz-specific.
import json
import requests

endpoint = "https://api.openai.com/v1/chat/completions"  # standard OpenAI path; swap in the fiz URL
headers = {"Content-Type": "application/json",
           "Authorization": "Bearer YOUR_TOKEN"}
post_data = {"model": "gpt-3.5-turbo",
             "messages": [{"role": "user", "content": "say test"}],
             "max_tokens": 100,
             "stream": True}  # <---
response = requests.post(endpoint,
                         headers=headers,
                         data=json.dumps(post_data),
                         stream=True,  # <----
                         timeout=30)
message_load = ""
for value in response.iter_lines(chunk_size=64,  # does not matter much, but 1024 max
                                 delimiter="data:",
                                 decode_unicode=True):
    try:
        resp = json.loads(value)
        incomplete_message = get_message(resp)  # pulls the text fragment out of one chunk
        if incomplete_message:
            message_load += incomplete_message
            resp["choices"][0]["delta"]["content"] = message_load
    except json.decoder.JSONDecodeError:
        pass  # empty splits and the final "[DONE]" sentinel are not JSON
print("output=", message_load)
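In case anyone wants the helper: here's a minimal sketch of what get_message() would do (the name is from the code above, but the body is my guess based on the standard OpenAI streaming delta format):

```python
def get_message(resp):
    # Each streamed chunk carries a partial "delta"; the "content" key
    # is missing in the first and last chunks, so default to empty string.
    try:
        return resp["choices"][0]["delta"].get("content", "")
    except (KeyError, IndexError):
        return ""
```

The try/except covers chunks that are valid JSON but don't have the choices/delta shape at all.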