
Improve API #18

Open · jethrogb opened this issue Aug 6, 2016 · 9 comments


@jethrogb (Contributor) commented Aug 6, 2016

I would like to build a function

fn read_headers<'a, 'b, R: BufRead>(
    stream: &mut R,
    buf: &'b mut Vec<u8>,
    headers: &'a mut [Header<'b>],
) -> Result<Request<'a, 'b>, E>

that reads as much from stream into buf as necessary to get a Complete return from Request::parse. This turns out not to be trivial, and it requires a lot of extra allocations and work.

Here's what I came up with:

fn read_headers<'a, 'b, R: BufRead>(clnt: &mut R, buf: &'b mut Vec<u8>, headers: &'a mut [Header<'b>]) -> Result<Request<'a, 'b>, String> {
    // The inner function owns its own buffer and allocates fresh temporary
    // headers on every iteration, because Request::new ties the borrow of
    // both to the Request for its entire lifetime.
    fn extend_and_parse<R: BufRead>(clnt: &mut R, headers: &mut [Header]) -> Result<Vec<u8>, String> {
        let mut buf = Vec::<u8>::new();
        let len = headers.len();
        loop {
            let buf_orig_len = buf.len();
            let additional_len = {
                let additional = try!(clnt.fill_buf().map_err(|e| e.to_string()));
                buf.extend_from_slice(additional);
                additional.len()
            };
            // A brand-new header array every iteration: the previous one is
            // still borrowed by the failed parse and can't be reused.
            let mut headers = Vec::with_capacity(len);
            headers.resize(len, httparse::EMPTY_HEADER);
            let mut req = Request::new(&mut headers);
            match req.parse(&buf) {
                Ok(httparse::Status::Complete(n)) => {
                    // Only consume up to the end of the request head.
                    clnt.consume(n - buf_orig_len);
                    break;
                }
                Ok(httparse::Status::Partial) => {
                    clnt.consume(additional_len);
                }
                Err(e) => return Err(format!("HTTP parse error {:?}", e)),
            };
        }
        Ok(buf)
    }

    let result = extend_and_parse(clnt, headers);
    result.map(move |nb| {
        ::core::mem::replace(buf, nb);
        // Parse a second time into the caller's headers, so the returned
        // Request borrows the caller-provided buffer and header slice.
        let mut req = Request::new(headers);
        req.parse(buf);
        req
    })
}

The main issues are having to allocate a new array of temporary headers on every iteration, and having to parse the successful result twice.

I think this is partially Rust's fault, but also partially httparse's, for having a not-so-great API. For example, the lifetime of everything is fixed upon Request creation, so a parse failure doesn't release the borrow of the input buffer.
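To illustrate the borrow problem (a minimal sketch; read_more is a hypothetical helper that appends more bytes to the buffer):

let mut buf: Vec<u8> = Vec::new();
let mut headers = [httparse::EMPTY_HEADER; 16];
let mut req = httparse::Request::new(&mut headers);
// Suppose the first read gave us an incomplete request head:
if let Ok(httparse::Status::Partial) = req.parse(&buf) {
    // ERROR: `buf` can't be borrowed mutably here, because the failed
    // parse did not release the immutable borrow taken by `req.parse`.
    read_more(&mut buf);
    req.parse(&buf);
}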

@seanmonstar (Owner)

I agree that the lifetime situation feels difficult. If I could fix anything, I'd fix the feeling of trying to use httparse.

As for your specific example, I don't believe allocating a Vec of headers is necessary, unless you need to decide at runtime how many is the maximum allowed.

@jethrogb (Contributor, Author) commented Aug 9, 2016

> As for your specific example, I don't believe allocating a Vec of headers is necessary, unless you need to decide at runtime how many is the maximum allowed.

You need a temporary header array that's as big as the input slice headers. You can't size that statically until we get integer generics.
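Concretely, today the scratch array has to be sized at runtime; a const-generic signature (hypothetical, sketched below) would be needed to put it on the stack:

// Today: heap-allocate the scratch headers, sized from the input slice.
let mut tmp = vec![httparse::EMPTY_HEADER; headers.len()];

// Hypothetical, once integer generics exist: size it statically instead.
fn read_headers<'a, 'b, R: BufRead, const N: usize>(
    stream: &mut R,
    buf: &'b mut Vec<u8>,
    headers: &'a mut [Header<'b>; N],
) -> Result<Request<'a, 'b>, String> {
    let mut tmp = [httparse::EMPTY_HEADER; N]; // stack-allocated scratch space
    // ...
    unimplemented!()
}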

@seanmonstar (Owner)

Sure. Who provides the headers slice? I'm actually quite curious about this design, because I haven't encountered it in either sync hyper or async hyper, and I haven't seen others use it that way in other mini HTTP implementations.

For reference, this is how hyper 0.9 would keep reading into a buffer synchronously and try to parse again: https://github.com/hyperium/hyper/blob/0.9.x/src/http/h1.rs#L869-L899
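Paraphrased, that loop looks roughly like this (a sketch, not hyper's actual code; read_into stands in for its buffer-filling helper):

const MAX_HEADERS: usize = 100; // a fixed compile-time maximum

loop {
    try!(read_into(stream, &mut buf)); // hypothetical "append more bytes" helper
    // A fresh, stack-allocated header array on every attempt:
    let mut headers = [httparse::EMPTY_HEADER; MAX_HEADERS];
    let mut req = httparse::Request::new(&mut headers);
    match req.parse(&buf) {
        // Interpret `req` into your own types here, before `headers`
        // goes out of scope:
        Ok(httparse::Status::Complete(_)) => break,
        Ok(httparse::Status::Partial) => continue,
        Err(e) => return Err(format!("{:?}", e)),
    }
}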

@jethrogb (Contributor, Author) commented Aug 10, 2016

> I'm actually quite curious about this design, because I haven't encountered it in either sync hyper or async hyper, and I haven't seen others use it that way in other mini HTTP implementations.

> For reference, this is how hyper 0.9 would keep reading into a buffer synchronously and try to parse again: https://github.com/hyperium/hyper/blob/0.9.x/src/http/h1.rs#L869-L899

This looks like it basically runs the same loop as mine above, except with a fixed maximum number of headers, which means you're avoiding exactly the issues I'm running into. I'd like the caller to specify the maximum headers, and to avoid large additional allocations (on either the heap or the stack) during parsing. I'd also like to keep the httparse Request representation; in hyper you're interpreting the Request directly before returning.

@seanmonstar (Owner)

Been thinking about this: what if headers were not an array of slices, but an array of offsets? It'd remove the lifetimes, but require you to slice the buffer yourself with the offsets to get the name and value.
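A sketch of what that might look like (hypothetical types, not the current API):

// Hypothetical lifetime-free header: byte offsets into the buffer instead of slices.
struct HeaderIndices {
    name: (usize, usize),  // start and end of the header name within the buffer
    value: (usize, usize), // start and end of the header value within the buffer
}

// The caller slices the buffer themselves to recover the actual bytes:
fn header_name<'b>(buf: &'b [u8], h: &HeaderIndices) -> &'b [u8] {
    &buf[h.name.0..h.name.1]
}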

@seanmonstar (Owner)

Another option is to make the parse function update the lifetime of the Request. This might require an unsafe transmute, which would require that parse ensure all of the fields of Request, including the headers array, are reset, to prevent any dangling references.
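Something along these lines, perhaps (a hypothetical sketch, assuming it lived inside httparse where Request's fields can be reset):

use core::mem;
use httparse::{Request, EMPTY_HEADER};

// Hypothetical: consume the Request, wipe every field that borrows from the
// old buffer, then transmute only the buffer lifetime. This is sound only
// because nothing still points into the old buffer after the reset.
fn rebind<'h, 'b, 'b2>(mut req: Request<'h, 'b>) -> Request<'h, 'b2> {
    req.method = None;
    req.path = None;
    req.version = None;
    for h in req.headers.iter_mut() {
        *h = EMPTY_HEADER;
    }
    unsafe { mem::transmute(req) }
}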

@jethrogb (Contributor, Author)

What about merging new and parse into one associated function?

fn parse(headers: &'h mut [Header<'b>], buf: &'b [u8]) -> Result<(usize, Request<'h, 'b>)>

You'd need to change the Status enum as well...
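One hypothetical shape for the changed Status (not the current API):

// Complete carries the Request, so a Partial result borrows neither the
// headers slice nor the buffer, releasing both for the next attempt.
enum Status<'h, 'b> {
    Complete(usize, Request<'h, 'b>),
    Partial,
}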

@seanmonstar (Owner)

That could be done, but it still ties the lifetime of the Header slice to the buf, and it makes the Status trickier, as you said.

I think a reset() method can be added to Request that will empty all the fields, reset the Header slice, and update its lifetime to allow a new buffer to be used.
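Usage could then look something like this (hypothetical, since reset() doesn't exist; read_more stands in for refilling the buffer):

let mut headers = [httparse::EMPTY_HEADER; 16];
let mut req = httparse::Request::new(&mut headers);
loop {
    match req.parse(&buf) {
        Ok(httparse::Status::Complete(_)) => break,
        Ok(httparse::Status::Partial) => {
            // Hypothetical: reset() empties the fields and releases the
            // borrow of `buf`, so the buffer can be grown again.
            req = req.reset();
            read_more(&mut buf);
        }
        Err(e) => return Err(format!("{:?}", e)),
    }
}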

@adriangb commented Nov 7, 2022

I wrote a little Python wrapper for this library. From the perspective of writing a Python wrapper with PyO3, the lifetime on Header is problematic.

If you're using the flow of "try to parse; if incomplete, tell the Python caller to come back with more data", you have to re-allocate the headers every time, because you can't persist anything containing a Header (including the buffer) on the Python side due to the lifetime (PyO3 can't hand off lifetimes to Python).

The lifetime requirement on the data itself is not a problem; PyO3 lets you access the underlying memory without copying.
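To make the constraint concrete, the state one would want to persist across calls can't even be declared (a sketch; ParserState is hypothetical and deliberately doesn't compile):

use pyo3::prelude::*;

// Does not compile: #[pyclass] types cannot carry lifetime parameters, so
// nothing holding a httparse::Header<'b> can live on the Python side between
// "come back with more data" round-trips.
#[pyclass]
struct ParserState<'b> {
    headers: Vec<httparse::Header<'b>>,
}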
