In need of better encoding support for HttpRequest


#1

So I’m trying to hack together a cheap way to save an image to my bot, using essentially the same kind of request you would make for (customapi url).

For example, let’s say I wanted to download this image:

I would make a request like this:

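    // Java classes exposed to the script through Rhino's Packages bridge.
    var HttpResponse = Packages.com.gmt2001.HttpResponse,
        HttpRequest = Packages.com.gmt2001.HttpRequest,
        HashMap = Packages.java.util.HashMap,
        responseData = HttpRequest.getData(HttpRequest.RequestType.GET, 'https://i.imgur.com/FcZ7KRg.jpg', '', new HashMap());

    // The response body comes back as a string, not as raw bytes.
    var response = responseData.content;

    // Write that string to disk (false = do not append).
    $.writeToFile(response, './addons/earth.png', false);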

Which in turn would save a bunch of data to the file “earth.png”.

The problem is what data is being saved.

So if you were to download the above image and open it in a text editor like Notepad or Notepad++, you’d get some crazy jargon like this:

ÿØÿà JFIF ÿþ 4Optimized by JPEGmini 3.14.14.72670860 0x813388d2 ÿÛ C aaaa

Let’s compare that string to the one that was downloaded using the HttpRequest:

���� JFIF �� 4Optimized by JPEGmini 3.14.14.72670860 0x813388d2 �� C aaaa

Well well well… We now see some differences right away, and the entire file is like this. Most of the symbols have been converted into �, which is the replacement character (U+FFFD) in Unicode.

This issue also exists when you use the HTTP API for SQLite and your text has characters like emojis or symbols like the ones above.
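To show why the damage can't be undone after the fact, here is a minimal sketch in plain Node.js (not the bot's Rhino environment). It assumes the body was decoded as UTF-8 text somewhere along the way; the byte values are just the usual opening bytes of a JPEG, and the variable names are made up for illustration.

```javascript
// Typical opening bytes of a JPEG: FF D8 FF E0, a length field, then "JFIF".
const original = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46]);

// Treat the bytes as UTF-8 text. FF, D8 and E0 are not valid UTF-8 here,
// so the decoder swaps each of them for U+FFFD (the replacement character).
const asText = original.toString('utf8');

// Turn the text back into bytes, as a string-based file writer would.
// Every U+FFFD is written out as EF BF BD, so the original bytes are gone.
const roundTripped = Buffer.from(asText, 'utf8');

console.log(JSON.stringify(asText));
console.log(original.length, roundTripped.length);
console.log(original.equals(roundTripped));
```

The plain ASCII parts ("JFIF") survive, which is why the saved file still looks vaguely right in a text editor even though it is no longer a valid image.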

If there’s something that could be done (or that I could do) that’d be amazing.


#2

Already ahead of you. Check out com.illusionaryone.ImgDownload. You can find an example in the TwitterHandler.js file. This is how we get your image from Twitch and upload it to Twitter when you go live. Of course, you just care about the download part, so look at that part of the code.

You will note that it downloads to ./addons/downloadHTTP/ by default. I suppose we could look at changing that as a parameter, or, I believe you could just override with the filename (last parameter). Anyway, hopefully that does what you want.
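For anyone curious about the general idea behind a byte-level download, here is a rough sketch in plain Node.js. It is not the ImgDownload code and does not use its API; `saveBytes` and `downloadBinary` are hypothetical helpers, the temp directory name is made up, and `downloadBinary` assumes a Node version with a global `fetch` (18 or newer). The point is only that the data stays in a byte buffer from the response to the file and never passes through a string.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write raw bytes to disk with no text encoding involved.
function saveBytes(bytes, filePath) {
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, bytes); // a Buffer is written byte for byte
}

// Hypothetical helper: fetch a URL and keep the body as bytes the whole way.
// Assumes the global fetch available in Node 18+.
async function downloadBinary(url, filePath) {
    const res = await fetch(url);
    if (!res.ok) {
        throw new Error('HTTP ' + res.status);
    }
    const bytes = Buffer.from(await res.arrayBuffer());
    saveBytes(bytes, filePath);
    return bytes.length;
}

// Offline demo: the same JPEG header bytes survive a write/read round trip.
const demoBytes = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);
const demoPath = path.join(os.tmpdir(), 'downloadHTTP-demo', 'earth.jpg');
saveBytes(demoBytes, demoPath);
const readBack = fs.readFileSync(demoPath);
console.log(readBack.equals(demoBytes));
```

A real call would look something like `downloadBinary('https://i.imgur.com/FcZ7KRg.jpg', './addons/earth.jpg')`, which needs network access.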


#3

tbh, I’m not worried where it downloads, as long as I can find it in the correct directory :stuck_out_tongue:

But thanks for that. I tried searching as much as I could and of course something gets overlooked haha.


#4

No problem, this really makes me think that we need to implement JavaDoc and compile that o.O


#5

HttpRequest converts to string using the encoding hinted by the Content-Encoding header. If the header is not present, it falls back to whatever your OS says is the default. The issue was probably a missing Content-Encoding header causing a fallback to US-ASCII, the Windows default.
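The header-then-fallback idea can be sketched like this in plain Node.js. This is not the HttpRequest source: `charsetFromContentType` and `decodeBody` are hypothetical helpers, and the sketch assumes the charset arrives as the `charset` parameter of a Content-Type header, which is where HTTP normally carries it (Content-Encoding usually names a compression such as gzip). The fallback here is fixed to UTF-8 rather than the OS default, which is a design choice of the sketch.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
function charsetFromContentType(contentType) {
    const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || '');
    return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: decode bytes with the declared charset when Node's
// Buffer understands it, otherwise with an explicit fallback.
function decodeBody(bytes, contentType, fallback = 'utf8') {
    const charset = charsetFromContentType(contentType);
    const encoding = charset && Buffer.isEncoding(charset) ? charset : fallback;
    return Buffer.from(bytes).toString(encoding);
}

// "DROP " followed by one emoji (U+1F389), encoded as UTF-8 bytes.
const emojiBytes = Buffer.from('DROP \u{1F389}', 'utf8');

const declared = decodeBody(emojiBytes, 'application/json; charset=UTF-8');
const missing = decodeBody(emojiBytes, 'application/json');
const wrong = decodeBody(emojiBytes, 'text/plain; charset=latin1');

console.log(declared === 'DROP \u{1F389}');
console.log(missing === 'DROP \u{1F389}');
console.log(wrong === 'DROP \u{1F389}');
```

With a correct or missing header the emoji survives, because the fallback is pinned to UTF-8. With a wrong charset the same bytes turn into four unrelated characters, which is the kind of garbling a platform-dependent fallback can produce.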


#6

So on that note, how would you fix the encoding when using the HTTP SQLite API?


#7

You mean the PhantomBot API that returns the data? It just returns JSON objects, if that is the one you are talking about?


#8

I would have to investigate where the issue is occurring there, as we already set our Content-Encoding header to UTF-8.


#9

Are you using version 2.4.0? Pretty sure I’ve fixed an issue similar to this a few weeks ago.


#10

Well, the reason I ask is because a command in one of my bots is full of emojis:

Which ends up showing up like:

dongdrop:���������������������� DROP THAT DONG��������������������������������

in the request.

And yes, this was tested on 2.4.0.


#11

When trying to access the HTTP API from Chrome, I have no problems with non-ASCII characters. I am not sure what you are using that is causing an issue.

https://1drv.ms/i/s!AqJWOcIz2x-UgvNRQWsmu4Rh1IlS3Q


#12

I suppose what I’m viewing it in does make a difference.

I’m working with Java and trying to get the data to download/display correctly in a text document. Getting it to display in Java itself is probably another issue I’ll have to work on by myself.