-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set the rule handle after flush #88
base: main
Are you sure you want to change the base?
Conversation
@alexispires check out this library github.com/sbezverk/nftableslib it is based on github.com/google/nftables and it allows to do easily what you need. |
I think this change would break 1 important use case, when Flush() is called only once for everything, chains, rules, sets etc. We have a use case when everything gets prepared and then programmed in a single shot. I do not see how proposed implementation would address such case. |
Look at my test function, I send 1 table, 1 chain and 2 rules in a single transaction |
When there are several chains, multiple rules how do you insure correspondence with returned handle and rule? It seems you rely only on sequence, I do not think it is reliable enough. Still it is better to extend functionality which ensures backward compatibility and not change with breakage. |
I understand your concern. The same problem appeared to nft tool and the same assumptions were made, which makes them acceptable to me. Look at this commit : http://git.netfilter.org/nftables/commit/?id=bb32d8db9a125d9676f87866e48ffbf0221ec16a
The proposal to extend the functionality bothers me since it introduces a behavior specific to the go implementation. The notion of Handle is already used to identify the rules and the echo function exist mainly to return the Handle. Look at this commit too: http://git.netfilter.org/nftables/commit/?id=b99c4d072d9969f7a0dfc539b2b68b517f90af68 |
The work you quoted geared towards less dynamic and more static implementations, imho. My focus is in dynamic, concurrent use uses, example kube-proxy using nftables. Here is scenario when this approach pretty much guarantees the issue, there are 2 processes handling ipv4 and ipv6 tables concurrently, there is a possibility that ipv4 adn ipv6 rules are put in the same messages queue for Flush to program, if you cannot identify precisely which rule handle corresponds to which rule, then it is a matter of chance before they get mixed up. I think using |
It is a public and shared library, what is acceptable to you might not work for others, let's find a solution which is not breaking the current behaviour and providing you with what you need for your use case. |
I’m not sure yet whether we should support this behavior. For robustness, programs should generally do idempotent operations where possible. Changing the program from its current “modify/modify” approach to a “modify, query/modify” approach lends itself better to making the program work correctly when interrupted. Can you share some more details regarding what you’re trying to do? |
Today there's only two way to delete or replace a rule with this library:
IMO the right way is to get back the handle after flush. I can't tell much about what I'm trying to do (commercial product) but it's like a SDN application. I store in memory inside a container (so not problem with interruption it's almost stateless) all my rules to delete them after. I'm trying to code it in a strong way by matching the sequence number of request (rule by rule) with sequence of the response wich contain handle. Maybe it can be stronger. |
@stapelberg @sbezverk select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(......)
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout) |
I made a proto with select, what do you think? conn_linux.go (from github.com/mdlayher/netlink) func (c *conn) Select() (int, error) {
var fdSet unix.FdSet
fdSet.Zero()
fdSet.Set(c.s.FD())
n, err := unix.Select(c.s.FD()+1, &fdSet, nil, nil, &unix.Timeval{})
return n, err
} conn.go (from github.com/google/nftables) smsg, err := conn.SendMessages(batch(cc.messages))
if err != nil {
return fmt.Errorf("SendMessages: %w", err)
}
// Retrieving of seq number associated to entities
entitiesBySeq := make(map[uint32]Entity)
for i, e := range cc.entities {
entitiesBySeq[smsg[i].Header.Sequence] = e
}
// Trigger entities callback
for i := 1; i > 0; i, _ = conn.Select() {
rmsg, err := conn.Receive()
if err != nil {
return fmt.Errorf("Receive: %w", err)
}
for _, msg := range rmsg {
if e, ok := entitiesBySeq[msg.Header.Sequence]; ok {
e.HandleResponse(msg)
}
}
} |
I just added a more robust test suites (4096 rules added in // by 16 workers) with assert which rely on UserData to check Handle value. |
Cleaner version below, but both works and are reliable. diff --git a/conn.go b/conn.go
index 6a552d8..34cf0b9 100644
--- a/conn.go
+++ b/conn.go
@@ -17,7 +17,6 @@ package nftables
import (
"fmt"
"sync"
- "sync/atomic"
"github.com/google/nftables/expr"
"github.com/mdlayher/netlink"
@@ -41,7 +40,7 @@ type Conn struct {
sync.Mutex
put sync.Mutex
messages []netlink.Message
- entities map[int32]Entity
+ entities map[int]Entity
it int32
err error
}
@@ -52,7 +51,6 @@ func (cc *Conn) Flush() error {
defer func() {
cc.messages = nil
cc.entities = nil
- cc.it = 0
cc.Unlock()
}()
if len(cc.messages) == 0 {
@@ -71,7 +69,7 @@ func (cc *Conn) Flush() error {
cc.endBatch(cc.messages)
- _, err = conn.SendMessages(cc.messages[:cc.it+1])
+ _, err = conn.SendMessages(cc.messages)
if err != nil {
return fmt.Errorf("SendMessages: %w", err)
@@ -104,36 +102,29 @@ func (cc *Conn) Flush() error {
}
// PutMessage store netlink message to sent after
-func (cc *Conn) PutMessage(msg netlink.Message) int32 {
+func (cc *Conn) PutMessage(msg netlink.Message) int {
cc.put.Lock()
defer cc.put.Unlock()
if cc.messages == nil {
- cc.messages = make([]netlink.Message, 16)
- cc.messages[0] = netlink.Message{
+ cc.messages = append(cc.messages, netlink.Message{
Header: netlink.Header{
Type: netlink.HeaderType(unix.NFNL_MSG_BATCH_BEGIN),
Flags: netlink.Request,
},
Data: extraHeader(0, unix.NFNL_SUBSYS_NFTABLES),
- }
- }
-
- i := atomic.AddInt32(&cc.it, 1)
-
- if len(cc.messages) <= int(i) {
- cc.messages = resize(cc.messages)
+ })
}
- cc.messages[i] = msg
+ cc.messages = append(cc.messages, msg)
- return i
+ return len(cc.messages) - 1
}
// PutEntity store entity to relate to netlink response
-func (cc *Conn) PutEntity(i int32, e Entity) {
+func (cc *Conn) PutEntity(i int, e Entity) {
if cc.entities == nil {
- cc.entities = make(map[int32]Entity)
+ cc.entities = make(map[int]Entity)
}
cc.entities[i] = e
}
@@ -214,23 +205,14 @@ func (cc *Conn) marshalExpr(e expr.Any) []byte {
func (cc *Conn) endBatch(messages []netlink.Message) {
- i := atomic.AddInt32(&cc.it, 1)
-
- if len(cc.messages) <= int(i) {
- cc.messages = resize(cc.messages)
- }
+ cc.put.Lock()
+ defer cc.put.Unlock()
- cc.messages[i] = netlink.Message{
+ cc.messages = append(cc.messages, netlink.Message{
Header: netlink.Header{
Type: netlink.HeaderType(unix.NFNL_MSG_BATCH_END),
Flags: netlink.Request,
},
Data: extraHeader(0, unix.NFNL_SUBSYS_NFTABLES),
- }
-}
-
-func resize(messages []netlink.Message) []netlink.Message {
- new := make([]netlink.Message, cap(messages)*2)
- copy(new, messages)
- return new
+ })
} |
@alexispires could you please demonstrate that with this approach you can create rules with a reference to anonymous set. The reason for this request is the fact that creation of anonymous set must happen within the same transaction of a rule creation. It is not clear to me if it works with this solution. |
I'm not sure to understand, there are tests for this case (TestCreateUseAnonymousSet), you mean if handle ref is still retrieved ? |
@alexispires The test case only tests the sequence of bytes that gets sent, not the actual kernel/netfilter programming, unfortunately travis does not allow this type of "e2e" tests. I was asking you a favour, to build an actual simple binary, based on your change that programs a rule with a reference to an anonymous set and then run 'nft list table {your table name}' to confirm it. |
@sbezverk ok no problem, i'll come back to you |
It looks good: Before flush rule Handle is 0
After flush rule Handle is 3
table ip filter {
chain nfpoc {
type filter hook postrouting priority 0; policy accept;
tcp dport { tftp, 1163 } drop # handle 3
}
} Code (also in my github, with binary included for x64: https://github.com/alexispires/nfpoc): package main
import (
"fmt"
"os/exec"
"github.com/google/nftables"
"github.com/google/nftables/binaryutil"
"github.com/google/nftables/expr"
"golang.org/x/sys/unix"
)
func main() {
c := &nftables.Conn{}
table := c.AddTable(&nftables.Table{
Family: nftables.TableFamilyIPv4,
Name: "filter",
})
chain := c.AddChain(&nftables.Chain{
Name: "nfpoc",
Table: table,
Type: nftables.ChainTypeFilter,
Hooknum: nftables.ChainHookPostrouting,
Priority: nftables.ChainPriorityFilter,
})
set := &nftables.Set{
Anonymous: true,
Constant: true,
Table: table,
KeyType: nftables.TypeInetService,
}
if err := c.AddSet(set, []nftables.SetElement{
{Key: binaryutil.BigEndian.PutUint16(69)},
{Key: binaryutil.BigEndian.PutUint16(1163)},
}); err != nil {
fmt.Printf("c.AddSet() failed: %s", err.Error())
}
r := c.AddRule(&nftables.Rule{
Table: table,
Chain: chain,
Exprs: []expr.Any{
// [ meta load l4proto => reg 1 ]
&expr.Meta{Key: expr.MetaKeyL4PROTO, Register: 1},
// [ cmp eq reg 1 0x00000006 ]
&expr.Cmp{
Op: expr.CmpOpEq,
Register: 1,
Data: []byte{unix.IPPROTO_TCP},
},
// [ payload load 2b @ transport header + 2 => reg 1 ]
&expr.Payload{
DestRegister: 1,
Base: expr.PayloadBaseTransportHeader,
Offset: 2,
Len: 2,
},
// [ lookup reg 1 set __set%d ]
&expr.Lookup{
SourceRegister: 1,
SetName: set.Name,
SetID: set.ID,
},
// [ immediate reg 0 drop ]
&expr.Verdict{
Kind: expr.VerdictDrop,
},
},
})
fmt.Printf("Before flush rule Handle is %d\n", r.Handle)
if err := c.Flush(); err != nil {
fmt.Printf(err.Error())
}
fmt.Printf("After flush rule Handle is %d\n", r.Handle)
out, err := exec.Command("/usr/sbin/nft", "-a", "list", "table", "filter").Output()
if err != nil {
fmt.Printf(err.Error())
}
fmt.Printf("%s\n", out)
} |
@stapelberg do you need more informations? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! First review pass done
return fmt.Errorf("Receive: %w", err) | ||
// Retrieving of seq number associated to entities | ||
entitiesBySeq := make(map[uint32]Entity) | ||
for i, e := range cc.entities { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the separate entites
book-keeping, why not just iterate over cc.messages and check if the type implements HandleResponse?
for _, msg := range cc.messages {
m, ok := msg.(interface { HandleResponse(netlink.Message) })
if !ok {
continue
}
m.HandleResponse(rmsg)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HandleResponse is not a method that belong to netlink.Message. It's belong to entity.
For a given entity, it's perform some operations on it by using the netlink response. We have to connect an entity (like a Rule) and a netlink message in case of the slice of responses is unordered. I connect them with the netlink sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I’m still not particularly happy about having to manage messages
and entities
, which are separate but connected.
Instead of the current approach, we could make messages
take an interface type which has a NetlinkMessage
method. putMessages would then wrap a netlink.Message
in a type which implements NetlinkMessage
, and can optionally implement HandleResponse
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stapelberg I'm not sure to understand what do you have in mind but the main limit I have is that the SendMessages method of netlink library take a slice of netlink.Message and not a slice of netlink.Message's pointer. So the only data struct that give me the netlink sequence number is the slice of netlink.Message returned by sendMessages. So the slice of netlink.Message have to be connected in one way or another to something else. In your solution I have to connect the type that wrap netlink.Message with the netlink.Message that contains the seq even if both are the same netlink message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m a little confused now: you’re saying the sequence numbers are only set in the []netlink.Message that’s returned by (*netlink.Conn).SendMessages
, but in your PR, you’re discarding that result:
if _, err = conn.SendMessages(cc.messages); err != nil {
Is that a bug in your PR, or am I misunderstanding?
(Independently, related to your question: the fact that the netlink package takes a []netlink.Message instead of []*netlink.Message doesn’t limit us in which data structures we want to use. If need be, we can do a copy from one to the other. A netlink.Message’s Header is just a few ints, and copying the Body byte slice won’t need to copy its contents.)
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this test running many workers concurrently? In other words: why is not sufficient to test with 1 worker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To test the behaviors with concurrency as exprimed here: #88 (comment)
IMO it's safer to keep it to identify regression on concurent access. But it's not specific on this part of lib, I think concurrency have to be tested on the whole lib.
@alexispires @stapelberg Can we preserve old Flush() call the way it is now, but new changes implement in something like FlushWithCallback. I will not be able to switch to new functionality right away, maybe eventually after running some performance tests. |
I let @stapelberg decide about it, but what do you suggest as API ? |
@alexispires I was suggesting to have old API and new API in parallel, it will help to preserve the backward compatibility and give us a chance to move to new API when confidence/timing is right. |
TestDial nltest.Func // for testing only; passed to nltest.Dial | ||
NetNS int // Network namespace netlink will interact with. | ||
entities map[int]Entity | ||
messagesMu sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move messagesMu and messages down and separate them with a newline (to form a group of fields):
type Conn struct {
TestDial nltest.Func
// …
entities map[int]Entity
messagesMu sync.Mutex
messages []netlink.Message
}
} | ||
|
||
// Trigger entities callback | ||
msg, err := cc.checkReceive(conn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why call checkReceive instead of just calling conn.Receive? It’s not clear to me whether you’re blocking until the next message is available, or only reading as long as messages are available. If the latter, that doesn’t seem sound to me—instead, we should read blockingly until we can identify the end of the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to read as long as message are available. I'm not sure how to detect the end of the response without select.
Here an example of batch with echo flag:
sendmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{{len=20, type=NFNL_MSG_BATCH_BEGIN, flags=NLM_F_REQUEST, seq=3, pid=0}, "\x00\x00\x0a\x00"}, {{len=36, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, flags=NLM_F_REQUEST|NLM_F_ECHO, seq=4, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x08\x00\x02\x00\x00\x00\x00\x00"}, {{len=84, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=5, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x03\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x14\x00\x04\x80\x08\x00\x01\x00\x00\x00\x00\x00\x08\x00\x02\x00\x00\x00\x00\x00\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00"}, {{len=84, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=6, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x10\x00\x03\x00\x70\x6f\x73\x74\x72\x6f\x75\x74\x69\x6e\x67\x00\x14\x00\x04\x80\x08\x00\x01\x00\x00\x00\x00\x04\x08\x00\x02\x00\x00\x00\x00\x64\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00"}, {{len=96, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSET, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=7, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x03\x00\x00\x00\x00\x03\x08\x00\x04\x00\x00\x00\x00\x07\x08\x00\x05\x00\x00\x00\x00\x04\x08\x00\x0a\x00\x00\x00\x00\x04\x0c\x00\x09\x80\x08\x00\x01\x00\x00\x00\x00\x04\x0a\x00\x0d\x00\x00\x04\x02\x00\x00\x00\x00\x00"}, {{len=116, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=7, pid=0}, "\x02\x00\x00\x00\x0c\x00\x02\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x04\x00\x00\x00\x00\x04\x08\x00\x01\x00\x6e\x61\x74\x00\x44\x00\x03\x80\x10\x00\x01\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x54\x4d\x28\x84\x10\x00\x02\x80\x0c\x00\x01\x80\x08\x00\x01\x00\xb0\x1f\x35\x63\x10\x00\x03\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x51\x13\x60\x94\x10\x00\x04\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x8a\x64\x3e\x08"}, {{len=168, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWRULE, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_EXCL|NLM_F_CREATE|NLM_F_APPEND, seq=8, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x02\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x7c\x00\x04\x80\x34\x00\x01\x80\x0c\x00\x01\x00\x70\x61\x79\x6c\x6f\x61\x64\x00\x24\x00\x02\x80\x08\x00\x01\x00\x00\x00\x00\x01\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x03\x00\x00\x00\x00\x0c\x08\x00\x04\x00\x00\x00\x00\x04\x30\x00\x01\x80\x0b\x00\x01\x00\x6c\x6f\x6f\x6b\x75\x70\x00\x00\x20\x00\x02\x80\x08\x00\x02\x00\x00\x00\x00\x01\x0c\x00\x01\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x04\x00\x00\x00\x00\x04\x14\x00\x01\x80\x0c\x00\x01\x00\x63\x6f\x75\x6e\x74\x65\x72\x00\x04\x00\x02\x80"}, {{len=20, type=NFNL_MSG_BATCH_END, flags=NLM_F_REQUEST, seq=9, pid=0}, "\x00\x00\x0a\x00"}], iov_len=624}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 624
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=104, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=0, seq=5, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x01\x0f\x00\x03\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x14\x00\x04\x00\x08\x00\x01\x00\x00\x00\x00\x00\x08\x00\x02\x00\x00\x00\x00\x00\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x08\x00\x06\x00\x00\x00\x00\x04"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 104
fstat(1, {st_dev=makedev(0, 22), st_ino=3, st_mode=S_IFCHR|0620, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=1024, st_blocks=0, st_rdev=makedev(136, 0), st_atime=1579772816 /* 2020-01-23T10:46:56.597292734+0100 */, st_atime_nsec=597292734, st_mtime=1579772816 /* 2020-01-23T10:46:56.597292734+0100 */, st_mtime_nsec=597292734, st_ctime=1579772494 /* 2020-01-23T10:41:34.597292734+0100 */, st_ctime_nsec=597292734}) = 0
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=104, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=0, seq=6, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x02\x10\x00\x03\x00\x70\x6f\x73\x74\x72\x6f\x75\x74\x69\x6e\x67\x00\x14\x00\x04\x00\x08\x00\x01\x00\x00\x00\x00\x04\x08\x00\x02\x00\x00\x00\x00\x64\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x08\x00\x06\x00\x00\x00\x00\x00"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 104
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=100, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSET, flags=0, seq=7, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x0c\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x09\x08\x00\x03\x00\x00\x00\x00\x03\x08\x00\x04\x00\x00\x00\x00\x07\x08\x00\x05\x00\x00\x00\x00\x04\x0a\x00\x0d\x00\x00\x04\x02\x00\x00\x00\x00\x00\x0c\x00\x09\x00\x08\x00\x01\x00\x00\x00\x00\x04"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 100
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x54\x4d\x28\x84"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\xb0\x1f\x35\x63"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x51\x13\x60\x94"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x8a\x64\x3e\x08"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=204, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWRULE, flags=0, seq=8, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x02\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x0c\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x0a\x94\x00\x04\x00\x34\x00\x01\x00\x0c\x00\x01\x00\x70\x61\x79\x6c\x6f\x61\x64\x00\x24\x00\x02\x00\x08\x00\x01\x00\x00\x00\x00\x01\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x03\x00\x00\x00\x00\x0c\x08\x00\x04\x00\x00\x00\x00\x04\x30\x00\x01\x00\x0b\x00\x01\x00\x6c\x6f\x6f\x6b\x75\x70\x00\x00\x20\x00\x02\x00\x0b\x00\x01\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x05\x00\x00\x00\x00\x00\x2c\x00\x01\x00\x0c\x00\x01\x00\x63\x6f\x75\x6e\x74\x65\x72\x00\x1c\x00\x02\x00\x0c\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 204
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about setting NLM_F_ACK
or NLM_F_ECHO
on the batch start and batch end messages? That way, you should be able to read until you encounter the sequence number for the batch end message.
return fmt.Errorf("Receive: %w", err) | ||
// Retrieving of seq number associated to entities | ||
entitiesBySeq := make(map[uint32]Entity) | ||
for i, e := range cc.entities { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I’m still not particularly happy about having to manage messages
and entities
, which are separate but connected.
Instead of the current approach, we could make messages
take an interface type which has a NetlinkMessage
method. putMessages would then wrap a netlink.Message
in a type which implements NetlinkMessage
, and can optionally implement HandleResponse
as well.
Damn I need this! Any changes we can get this merged? 🙏 |
I'm not familiar enough with the codebase to help resolve the merge conflicts :/ |
There was some reluctance about it. I could continue this work if it's deemed useful. |
Please do 🙏 I could really use the improvement you set out to fix here 🙇♂️ I think the comments from @stapelberg should be easily addressable 👌 |
@alexispires Just an FYI, this is where I want to use this: https://git.mills.io/prologic/box/issues/18 so it would be awesome if you could revive this PR of yours, rebase it and hopefully we can get it merged soon 🤞 Let me know if I can help in any way, I'm not that familiar with the codebase (let alone nftables 🤣) but my Go is "okay" 😅 |
I'm still waiting opinion (if it changed) from the maintainers about this feature. However you can fix your problem by doing as this:
|
Is that @stapelberg ?
Ahh brilliant! I didn't know you could do this 😅 |
@alexispires If you at least rebase this PR I can also test it in my project 👌 I'm not sure I wouldn't screw up the merge conflicts myself 😂 |
It’s been a long time since I last looked at this, so reading up always takes some time. Maybe a summary in the first post with open questions/issues would be helpful. I will note that I’m not against this feature per se, but we need to introduce it in a way that doesn’t cause breakage. There seem to be still a number of unaddressed code review comments. To move this forward, rebasing the PR and replying to the comments would be the first step :) |
Anyway, this PR is important for us. As for now, use this code as described above: var errNotFound = errors.New("not found")
// When a new rule is added or inserted, we do not know its handle.
// Use this method to get its handle if you want to delete it later.
func FindRule(c *nftables.Conn, rule *nftables.Rule) (*nftables.Rule, error) {
rules, err := c.GetRules(rule.Table, rule.Chain)
if err != nil {
return nil, err
}
for _, r := range rules {
if bytes.Equal(r.UserData, rule.UserData) {
for idx, e := range r.Exprs {
if _, ok := e.(*expr.Lookup); ok {
continue
} else if !reflect.DeepEqual(e, rule.Exprs[idx]) {
goto next
}
}
return r, nil
}
next:
}
return nil, errNotFound
} rule := conn.InsertRule(&nftables.Rule{...})
err := conn.Flush()
if err != nil {
return nil, err
}
rule, err = FindRule(conn, rule)
if err != nil {
return nil, err
} |
Currently this very basic and useful workflow is not possible:
The solution:
Because of
NLM_F_ECHO
all netlink messages in the transaction are returned back. In the case of rules, there are returned back with Handle. Rules are linked to requests by keeping the index of netlink request of a rule. We rely on sequence number to retrieve the right request when receiving a netlink response.What does the solution allow?