I have a client who needs to limit the number of requests made to my Azure Translation API service. I found information from Microsoft on how to implement request throttling, but it's unclear where exactly in the request this throttling data should be included. Do I need to add it to the request headers?
Here is the link from Microsoft that discusses implementing flexible throttling: https://learn.microsoft.com/en-us/azure/api-management/api-management-sample-flexible-throttling
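For context, the article does not show a header at all. As far as I can tell, it writes the limit as an element inside an API Management policy definition, roughly like the fragment below. The values and the counter-key expression are what I understand from the article; I have not applied this to my own service, so treat it as a sketch of what I am trying to reproduce.

```xml
<!-- Sketch of the policy element as I understand it from the linked article. -->
<!-- Limits each caller IP address to 10 calls per 60-second window. -->
<rate-limit-by-key calls="10"
                   renewal-period="60"
                   counter-key="@(context.Request.IpAddress)" />
```

This is why I am unsure whether the same three settings (calls, renewal-period, counter-key) can be sent by the client with each request.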
Below is an example curl command with rate limiting headers at the end. Is this the correct approach to implement request throttling?
# Pass secret key and region using headers to a custom endpoint
curl -X POST "https://my-ch-n.cognitiveservices.azure.com/translator/text/v3.0/translate?to=fr" \
-H "Ocp-Apim-Subscription-Key: xxx" \
-H "Ocp-Apim-Subscription-Region: switzerlandnorth" \
-H "Content-Type: application/json" \
-H "rate-limit-by-key: calls=10 renewal-period=60 counter-key=1.1.1.1" \
-d "[{'Text':'Hello'}]" -v