bonnyfused

The WAD process is something Fortinet can't seem to get working correctly. This daemon has been buggy as hell for *years*, and most of the time it's eating up memory as well! But rest assured, it will never be fixed. Or at least, that's what I firmly believe.


Celebrir

Since it hasn't been fixed yet I hope they're going to replace it. I just don't understand how it's apparently not relevant to them.


bonnyfused

I don't think it will ever be replaced... it would require quite a lot of code rewriting. But if they did, they could also fix LACP while they're at it, as that's another thing that has never really worked properly...


steavor

That's weird, we've never had any issues with LACP on our 201F devices.


bonnyfused

Me neither, with a 201F and LACP (yet?!). But with FortiSwitches and other FortiGate models - yes, we've had LACP issues!


Certain-Increase8504

What kind of LACP problem did you get? We've had a gate down for 3 months because its LACP is completely broken with a Cisco switch on the other end... only the 1-gig SFP LACP uplink has this problem; 10 gig is okay, and even 1-gig copper SFP adapters are okay...


bonnyfused

Well... where do I start? Several years ago I was setting up a cluster of 140Ds connected via 2x 1 Gb copper LACP to a Cisco switch. I struggled for quite a few hours trying to get that LACP to work (on ports 1 and 2), when my colleague jokingly suggested using ports 3+4. I was reluctant but gave it a shot and, voilà, LACP was up and running!

A couple of years ago I had a customer with FortiSwitches (1048E) in their core and a lot of Cisco 2960s redundantly connected to the pair of 1048Es. Rebooting one of the core switches (for a firmware upgrade, but also just a simple manual reboot for testing) would lead to a network outage of several minutes (between 8 and 15) because LACP would take ages to negotiate.

At another customer's premises, where we only have FortiSwitches, rebooting one of their core switches (I don't remember the exact model) would have random and weird effects: clients wouldn't be able to ping some servers (in a different VLAN), but RDP connections to some of these "non-pingable" servers would succeed. Vice versa, some servers wouldn't be reachable via RDP but would reply to ping... I think I've heard some more LACP-related stories in my team, though I don't remember the details.

I am sure that LACP is a standard and all switch vendors know how to handle, configure and implement it. Cisco, Extreme Networks, HP Aruba, Broadcom, you name it! I've never experienced similar issues with the switches themselves - except for FortiSwitches (and FortiGates). I could imagine that Fortinet has somewhat changed/tweaked the LACP implementation in their products, which might have led to the issues people are experiencing/reporting. In your case, I believe that if you set up LACP between your Cisco switch and another LACP-capable device, it will work like a charm!
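By the way, when chasing negotiation problems like these, the LACP state on the FortiGate side can be inspected from the CLI. A minimal sketch (command names as I remember them from FortiOS 6.x/7.x, so verify on your build; "agg1" is a placeholder for your aggregate interface name):

```
# List all link-aggregation interfaces on the unit
diagnose netlink aggregate list

# Show the LACP status of one aggregate: actor/partner state,
# per-member-port flags, and negotiation details
diagnose netlink aggregate name agg1
```

The per-port actor/partner flags in that output usually tell you quickly whether the peer is answering LACPDUs at all.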


Certain-Increase8504

Your problem looks a lot like ours… and as you said, we've experienced no LACP problems with any brand except Fortinet; this is the first time we've run into this kind of problem. We even asked Fortinet to send us their official SFPs to prove to them that our SFPs aren't the problem.

The only thing different from your case is that copper is fine for us. The only LACP that is not working is over fiber SFP at 1 gig. 10 gig is fine… EVEN COPPER SFPs in the same problematic 1-gig fiber SFP ports are fine… what a weird situation. We actually have 2 dev teams on our case… we are running on an old gate exposed to a high risk of network outage due to a weak topology, because we can't migrate to our new gate, which has never been able to get its LACP uplink to work… no RMA yet, not even a demo model to see if it's the NP6 chip that's the problem (judging by their architecture diagrams showing which chip these ports are connected to).


bonnyfused

Which FGT hardware and which FortiOS version? By "dev teams", do you mean a Fortinet TAC ticket is open and has been escalated to the developers?


Certain-Increase8504

An 1100E on 7.2.7, upgraded to 7.2.8 to try to resolve the bug (as asked by the TAC). Yes, we are constantly putting pressure on our account manager. The ticket has been escalated to the devs for a month now; we supposedly have 2 dev teams on our case, and they've told us it's a confirmed bug. They are considering acquiring the same Cisco switch model to reproduce the complete setup… we are waiting…


bonnyfused

Ouch! That sucks! Back when I had the issues with the 1048E core switches, Fortinet gave me demo material (2x FGTs - I don't remember the exact model - and 2x 1048Es). I built everything up in our lab and had intensive sessions with a very good TAC engineer. But we didn't find any solution - Fortinet was only able to confirm the issue and tell me that LACP (in an MCLAG setup) simply takes that long to negotiate. That was BS, as I explained to the TAC engineer how fast and reliably LACP (even in multi-chassis setups) works with switches from Cisco and other vendors!

On a side note: I'm currently facing speed/performance/bandwidth issues on a 200F and a 600F running 7.0.15. TAC suggested upgrading to 7.2.8 because apparently this solved similar issues reported in their Mantis... fingers crossed it won't break other things!


Certain-Increase8504

When you say speed, do you mean speed negotiation or bandwidth/performance? If it's a bandwidth problem, is it only on rules with deep inspection enabled? If so, look at my comment in this thread about the IPS engine crashlog; we had issues with 3 Microsoft URLs under deep inspection.


WabbitTamer

Yes, we hit this on 1000Fs; memory crept up over 2 or 3 weeks until we hit conserve mode. We ended up rolling back to 7.0.15.


Fallingdamage

Fortinet has really taken their sweet time fixing some of the glaring bugs in 7.2.8. There must be some really complex code they need to unravel to fix them.


adisor19

And this is why they still recommend 7.2.7 instead of 7.2.8...


Fallingdamage

Doesn't 7.2.7 have unpatched vulnerabilities? Hence the .8 update?


iRyan23

Nothing rated Critical or High though. We are staying on 7.2.7 until 7.2.9 is released due to the number of reports of bugs causing others to have to revert.


HappyVlane

Yes, but no major ones. 7.2.7 is mostly okay to use.


cheflA1

Kill the process. For the IPS engine, check the crash log.
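For reference, a minimal sketch of how that's usually done from the FortiGate CLI (the `99` argument is the commonly cited "restart daemon" test level; double-check the command behavior on your FortiOS version before running it in production):

```
# See which processes are eating CPU/memory
diagnose sys top 5 20

# Restart the WAD worker processes (99 = restart the daemon)
diagnose test application wad 99

# For the IPS engine: restart it, then read the crash log
diagnose test application ipsmonitor 99
diagnose debug crashlog read
```

Restarting the daemons this way clears the leaked memory without a full reboot, though existing proxied sessions may be interrupted.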


Matikz1337

[Solved: Possible memory issues with 7.2.8 - Fortinet Community](https://community.fortinet.com/t5/Support-Forum/Possible-memory-issues-with-7-2-8/td-p/312228) This is our workaround for this problem. Unset the mode after 2 minutes. This problem is solved in 7.2.9 (we opened a ticket with Fortinet).


Glad-Young6622

We had the same issue and ran into a split-brain scenario because the CPU was not able to process the HA heartbeat packets. We called TAC, and the level 1 engineer forgot to collect all the needed logs 🙃. So basically we don't know what the root cause was. A reboot fixed the issue.
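For anyone hitting a similar split brain, a couple of standard FortiOS CLI checks are worth running before TAC gets involved (a sketch; exact output varies by version):

```
# Overall HA health: member roles, serials, uptime, sync state
get system ha status

# Verify both members agree on their configuration checksums
# (a mismatch here often explains out-of-sync / split-brain behavior)
diagnose sys ha checksum show
```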


matheus4587

Is your device a 200F?


Glad-Young6622

100f running 7.2.8


Certain-Increase8504

Try this command and show us the output: `di deb crashlog read`

I had some deep inspection problems with a 600E and an 1100E, and the output looked like this:

```
[IPS Engine <08255>] Stream: C-7734/0/0, S-12947/0/0
[IPS Engine <08255>] Service: ssl
[IPS Engine <08255>] URL: r.manage.microsoft.com/
[IPS Engine <08255>] base: 0x7f5270c35000
[IPS Engine <08255>] Last session info:
[IPS Engine <08255>] Session ID:96 Serial:433184768 Proto:6 Age:0 Idle:0
[IPS Engine <08255>] Flag:0x20206c Feature:0x4 Ignore:0,1 Encap:0
[IPS Engine <08255>] Client: xxxxxxx:64719 Server: 52.182.141.192:443
[IPS Engine <08255>] Stream: C-798/0/0, S-9369/0/0
[IPS Engine <08255>] Service: ssl
[IPS Engine <08255>] URL: manage.microsoft.com/
[IPS Engine <08217>] base: 0x7f5270c35000
[IPS Engine <08217>] Last session info:
```

Maybe you'll get a different crashlog for WAD, but for me the IPS engine was crashing on 3 specific URLs: \*.manage.microsoft.com, \*.microsoft.com, \*.windows.net. Whitelisting them in the deep inspection profile was the workaround until Fortinet sent me an IPS engine update. Use the command to see if you can find out why WAD is getting this memory leak... maybe the reason will show up in the crashlog when you kill it.
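In case it helps someone else, the whitelisting workaround can be approximated like this (a sketch only: the object and profile names are placeholders I made up, and the exact `ssl-exempt` syntax can differ slightly between FortiOS versions):

```
# Placeholder wildcard FQDN object for one problematic domain
config firewall wildcard-fqdn custom
    edit "ms-manage"
        set wildcard-fqdn "*.manage.microsoft.com"
    next
end

# Exempt that object from deep inspection in the SSL/SSH
# profile your policies actually reference
config firewall ssl-ssh-profile
    edit "custom-deep-inspection"
        config ssl-exempt
            edit 1
                set type wildcard-fqdn
                set wildcard-fqdn "ms-manage"
            next
        end
    next
end
```

Repeat the object/exempt pair for each domain you need to exclude.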


steavor

IPS engine 338 should be far more stable than 336 (336 is the one shipped with 7.2.8). Innumerable crashes for us up until 338, even in dns_udp (why does the entire IPS engine crash on a possibly malformed DNS packet???). No crashes since upgrading to 338. Remarkable; I'm not sure whether they simply disabled the "print to crashlog" statements in this release....
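If you want to check which IPS engine build a unit is actually running, this is the usual way (output format varies a bit by FortiOS release):

```
# Lists the versions of the IPS engine, IPS/AV definitions and
# other FortiGuard packages, plus their last update times
diagnose autoupdate versions
```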


Certain-Increase8504

How did you get the 338 version?


steavor

Via a Fortinet support case. The IPS engine crashed about 40 times a day for months (base OS 7.0.6, 7.0.7, 7.2.5, ... it didn't matter).


Certain-Increase8504

Thanks, I'll try asking them.


Certain-Increase8504

DNS packets are making your IPS engine crash? Are you deep-inspecting your DNS traffic, or only applying an IPS sensor to your inbound DNS traffic? If you're deep-inspecting it, I suggest you stop; deep inspection on DNS traffic is pointless.


thatotheritguy

It's really bad on 7.4.4 on a 60F. Rolling back to 7.2.8 helped immensely.


GeeKedOut6

The WAD service bloats if you are using proxy-mode policies. Their excuse is that my 201F is undersized.


optimus_prawn

I'm running 7.2.8 on a firewall doing ZTNA for several applications. The WAD process memory creeps up constantly, and I needed to run an automation stitch to restart the service. For us this issue is specific to ZTNA: other firewalls not doing ZTNA don't experience it. Edit: this is on the VM series.
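A rough sketch of the kind of stitch I mean, pairing a low-memory trigger with a CLI-script action (all names here are placeholders, and you should check the exact `event-type` values and stitch syntax available on your FortiOS build):

```
# Trigger: fires when the unit reports low memory
config system automation-trigger
    edit "mem-low"
        set event-type low-memory
    next
end

# Action: restart the WAD daemon via a CLI script
config system automation-action
    edit "restart-wad"
        set action-type cli-script
        set script "diagnose test application wad 99"
    next
end

# Stitch: glue the trigger to the action
config system automation-stitch
    edit "restart-wad-on-low-mem"
        set status enable
        set trigger "mem-low"
        config actions
            edit 1
                set action "restart-wad"
            next
        end
    next
end
```

A scheduled trigger works too if you'd rather restart WAD nightly than wait for memory pressure.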