They are the full traces @pippincp, I think I just didn't copy and paste the final "Trace complete" line. You can see the final IP is the same as the start IP. My post just before yours was just me comparing the traceroutes from the above post: when it was congested it used either redbus.ukcore.bt.net or slough.ukcore.bt.net rather than telehouse.ukcore.bt.net.
I just did one to bbc again:
Tracing route to bbc.co.uk [126.96.36.199]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.1.254
2 4 ms 3 ms 3 ms 172.16.10.180
3 * 6 ms 6 ms 188.8.131.52
4 7 ms 6 ms 6 ms 184.108.40.206
5 7 ms 6 ms 6 ms core1-hu0-8-0-1.colindale.ukcore.bt.net [220.127.116.11]
6 8 ms 7 ms 8 ms 18.104.22.168
7 * * * Request timed out.
8 7 ms 7 ms 7 ms 22.214.171.124
Before, when I had congestion problems it was:
Tracing route to bbc.co.uk [126.96.36.199]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.1.254
2 4 ms 3 ms 4 ms 172.16.10.180
3 * * * Request timed out.
4 7 ms 7 ms 7 ms 188.8.131.52
5 7 ms 6 ms 6 ms core2-hu0-2-0-3.colindale.ukcore.bt.net [184.108.40.206]
6 8 ms 8 ms 7 ms peer3-et-7-0-4.redbus.ukcore.bt.net [220.127.116.11]
7 * * * Request timed out.
8 7 ms 6 ms 6 ms 18.104.22.168
But like I said, it might be something, but it could also be because it's actually using a different IP for BBC. Can't really get much from ICMP anyway, just thought it was interesting to add.
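If anyone wants to compare two traces without eyeballing them, here is a rough Python sketch. It assumes the standard Windows tracert line layout (hop number, three timings, then the host), and the function names are just ones I made up. The sample lines are hops 5 to 7 from the two traces above.

```python
import re

# One tracert hop line: hop number, three timings ("7 ms", "<1 ms" or "*"), then the host.
HOP_RE = re.compile(r"^\s*(\d+)\s+(?:(?:<?\d+ ms|\*)\s+){3}(.+?)\s*$")

def parse_tracert(text):
    """Return {hop_number: host} for every hop line in Windows tracert output."""
    hops = {}
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header lines such as "Tracing route to ..."
        # Keep only the hostname when the line reads "name [ip]".
        hops[int(m.group(1))] = m.group(2).split(" [")[0]
    return hops

def diff_hops(a, b):
    """Return (hop, host_a, host_b) for each hop where the two traces differ."""
    return [(n, a.get(n), b.get(n))
            for n in sorted(set(a) | set(b))
            if a.get(n) != b.get(n)]

NOW = """\
5 7 ms 6 ms 6 ms core1-hu0-8-0-1.colindale.ukcore.bt.net [220.127.116.11]
6 8 ms 7 ms 8 ms 18.104.22.168
7 * * * Request timed out.
"""

CONGESTED = """\
5 7 ms 6 ms 6 ms core2-hu0-2-0-3.colindale.ukcore.bt.net [184.108.40.206]
6 8 ms 8 ms 7 ms peer3-et-7-0-4.redbus.ukcore.bt.net [220.127.116.11]
7 * * * Request timed out.
"""

for hop, before, after in diff_hops(parse_tracert(NOW), parse_tracert(CONGESTED)):
    print(hop, before, "->", after)
```

A different path on its own doesn't prove congestion, it just makes it easier to see which hops changed between a good run and a bad one.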
Thank god I’ve come across your post. I’m experiencing the same issues you’ve posted. Had an Openreach engineer out last week who found nothing wrong. But I’m still having issues in the evenings, and it’s driving me crazy tbh.
If you are having similar problems with download speed at peak times then an engineer visit is not going to find anything (as you have found out). You need to start your own thread and post router stats and btspeedtester results using an Ethernet connection, once when download speed is OK and again when it is poor.
You also need to post dslchecker results, using your phone number, and leave in the exchange and cab number.
We can then take it from there.
Another update: I had the Openreach engineer visit, the line was checked and came back all good. One of the mods, NeilO, is not giving up though and has now got another team to look into this, and they are monitoring the throughput on my line for 7 days to try and work out where the packet loss is coming from. I keep suspecting my own PC, but then I get the same results on my phone at the same time and, again, it's only at peak times.
I'm starting to suspect the capacity of the FTTC cabinet's fibre link to the exchange. I know Openreach are supposed to monitor this all the time, but it is strange that I had the problem with TalkTalk as well. I have no idea if they use 1Gbps or 10Gbps links to each port card, but given how much people use Netflix etc., I could see how a 1Gbps link could get saturated at peak times.
I found a program called StarTrinity Continuous Speed Test. It claims to continually download/upload to various servers to monitor packet loss and ping.
You can set a limit on the bandwidth. So at first I set it to 10Mbps/2Mbps and I got no problems:
Then I set the limit to my sync speed, 55Mbps/10Mbps, and these are the current results showing the packet loss and high ping:
It does seem as though the packet loss is causing the TCP connection to throttle itself back to maintain a stable speed of around 25Mbps. This was at about half 6pm. I can only show you screenshots, but I've had it running for about 20 minutes and I can see the packet loss jump around from 0.10% to over 2%, and the speed reflects the amount of packet loss.
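To put a rough number on how much a little packet loss can hurt, here is a sketch using the well-known Mathis et al. approximation for a single TCP flow (throughput is roughly MSS / RTT x C / sqrt(loss), with C about 1.22). This is not what the speed test program does internally, it's just a rule of thumb. The 7 ms RTT is the idle ping figure from the traces earlier in the thread, the loss figures are the ones I saw, and the 1460-byte MSS is my assumption for a typical Ethernet path.

```python
import math

def mathis_throughput_mbps(rtt_ms, loss_rate, mss_bytes=1460):
    """Rough ceiling for one TCP flow in Mbit/s (Mathis et al. approximation).

    rtt_ms     round-trip time in milliseconds
    loss_rate  packet loss as a fraction, e.g. 0.02 for 2%
    mss_bytes  TCP maximum segment size (1460 is an assumed typical value)
    """
    if rtt_ms <= 0 or loss_rate <= 0:
        raise ValueError("rtt_ms and loss_rate must both be greater than zero")
    c = math.sqrt(3 / 2)  # constant from the model, about 1.22
    bits_per_second = (mss_bytes * 8) / (rtt_ms / 1000) * c / math.sqrt(loss_rate)
    return bits_per_second / 1e6

# Loss figures seen in the speed test: 0.10% and 2%.
for loss in (0.001, 0.02):
    print(f"loss {loss:.2%}: about {mathis_throughput_mbps(7, loss):.1f} Mbit/s per flow")
```

The ceiling falls with the square root of the loss rate, so going from 0.10% to 2% loss cuts the per-flow limit by a factor of about 4.5. The real RTT under load will be higher than 7 ms, which would pull the figure down further, and a test using several connections at once can exceed the single-flow limit, so treat it as a ballpark only.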
I will keep it running tonight, hopefully it will help the team monitoring the connection.
And now at 11.30PM:
The speeds do still drop every now and then, but it's much less now. As before I kept the BT wholesale report as it was, so that they have that info as the latest.
It's annoying that the ping tests using ICMP don't pick up any packet loss, so I can't identify for myself where it's happening. The actual ping times when I use the CMD prompt are decent, but I guess that's because they only send a small number of bytes.
I had a look at https://aastatus.net/ and did a keyword search for "10g". It does seem BT are upgrading their links from 1G to 10G at multiple sites to cope with Covid, and I did wonder if they were using 1G links. Looks like they are seeing issues in some areas, for example https://aastatus.net/36331 . I don't have much knowledge on the subject, but I'm guessing the cabs use 1G links. In usual circumstances this would be enough, but on a cab with 200 to nearly 400 connections all using TV services, Netflix etc., it's going to fill up pretty quickly. I know that my PCP cabinet has two FTTC cabinets connected to it as well as a G.fast pod on the side, so at least one of them is full. What I wonder is whether Openreach has kept up with capacity by adding fibre links, or if it's the ISP's responsibility to order them.
After a bit of research I found out my cabinet can host 288 or 384 connections (I think it's a 288, though I'm not 100% sure without getting a closer look), on 6 line cards. If each line card has a 1Gbps fibre link back to the exchange, that would give 20.83Mbps for each of its 48 connections if they were all in use at the same time, assuming I'm correct in thinking each line card gets its own 1Gbps link. If it's 1Gbps for the whole cab, then it's actually 3.47Mbps each if all 288 were in use. It would also make sense as to why BT can't see the congestion at the SVLAN, as the speed is dropped at the cabinet fibre link. They should have some kind of access to the DSLAM in the cabinet, though, to monitor capacity and dropped packets, and Openreach themselves should also be on top of it, but given Covid, maybe there are delays. I don't know.
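For anyone who wants to check the sums or try other link sizes, the arithmetic is just the link speed divided by the number of lines sharing it. A quick Python sketch, with the link speeds and line counts being my guesses from above rather than confirmed figures for the cabinet:

```python
def per_line_share_mbps(link_mbps, active_lines):
    """Even split of a backhaul link across lines that are all busy at once."""
    if active_lines <= 0:
        raise ValueError("active_lines must be greater than zero")
    return link_mbps / active_lines

LINK_1G_MBPS = 1000    # assumed 1Gbps backhaul link
LINK_10G_MBPS = 10000  # assumed 10Gbps backhaul link
LINES_PER_CARD = 48
LINE_CARDS = 6
LINES_PER_CAB = LINES_PER_CARD * LINE_CARDS  # 288

# One 1Gbps link per line card
print(round(per_line_share_mbps(LINK_1G_MBPS, LINES_PER_CARD), 2))  # 20.83
# One 1Gbps link shared by the whole cabinet
print(round(per_line_share_mbps(LINK_1G_MBPS, LINES_PER_CAB), 2))   # 3.47
# One 10Gbps link shared by the whole cabinet
print(round(per_line_share_mbps(LINK_10G_MBPS, LINES_PER_CAB), 2))
```

This is the worst case where every line is pulling data at the same moment, which never quite happens in practice, but it shows how little headroom a single 1Gbps link leaves on a full cabinet.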
As far as I know the sync rate is negotiated via the router to the DSLAM in the cabinet and is more to do with the copper than the fibre to the exchange, so I could see how this could go unseen. Engineers come out at off-peak times, do their connection tests to the exchange, see no issues with capacity and off they go.
I have seen plenty of people on the forum saying it's impossible for the congestion to be at the cabinet, but it simply can't be true in some circumstances, especially if you have all those connections on fibre links that do not have enough capacity at peak times.
I could be wrong, so if anyone knows the actual process of how fibre links are added to the cabinets and at what bandwidth, please let me know.
After some further digging I found https://my.btwholesale.com/pages/static/products-services/broadband.htm and it does seem as though there is some kind of minimum throughput SLA, oddly enough with the minimum being 20Mbps. I also found an old 2014 WBC FTTC handbook from BT Wholesale, with a chart of the minimum throughput SLA options depending on package, with 20Mbps being their minimum best effort, moving up to 60Mbps on an elevated package.
I used WinMTR tonight to try and figure out where the start of the problems is. At first I thought it was my router, as seen in this screenshot:
So I double checked and pinged the router itself and got:
My hub describes the next hop in the chain as "This is the address for the network element that assigns your BT Smart Hub an IP address." So I'm guessing that is the DSLAM/BRAS in the FTTC cabinet, and it's there that the latency shoots up and causes slow packets, which I'm assuming is due to bufferbloat. It does support my idea that this is a contention problem with the FTTC cabinet at peak times rather than the BT exchange, along with the BT data showing no congestion issues at the exchange. Either the cabinet is completely full and BT can't order any more bandwidth for it, or they need to order more. If that can't be done, then they need to get Openreach to upgrade the fibre links to 10G to support more bandwidth. I could be wrong and that IP may be the exchange edge router, but my hub admin page shows it with that description and I'm pretty sure it is the DSLAM that assigns the WAN IP. The reason my PC reports lost packets is the latency caused by the bufferbloat: it only waits so long, and it then slows the TCP connection, as it should to keep it stable.
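One thing to watch with WinMTR-style results is that a single slow hop doesn't mean much on its own, because routers often give low priority to answering pings. What matters is the first hop where the latency goes up and stays up for every hop after it. A small Python sketch of that check follows. The function name is my own, and the numbers in the example are made up purely for illustration, they are not the figures from my screenshots.

```python
def first_sustained_slow_hop(hops, threshold_ms):
    """Return the first hop number from which every replying hop is slow.

    hops          list of (hop_number, avg_ms) in path order, avg_ms is None
                  when the hop did not reply at all
    threshold_ms  latency above which a hop counts as slow
    Returns None if the latency recovers further along the path.
    """
    start = None
    for hop, avg_ms in hops:
        if avg_ms is None:
            continue  # no reply tells us nothing either way
        if avg_ms > threshold_ms:
            if start is None:
                start = hop
        else:
            start = None  # later hop is fine, so the earlier spike was harmless
    return start

# Made-up example: latency rises at hop 2 and stays high to the end.
sustained = [(1, 1.0), (2, 85.0), (3, None), (4, 90.0), (5, 88.0)]
# Made-up example: hop 2 is slow to answer but the hops behind it are fine.
one_off = [(1, 1.0), (2, 85.0), (3, 6.0), (4, 7.0)]

print(first_sustained_slow_hop(sustained, 30))
print(first_sustained_slow_hop(one_off, 30))
```

In the first example the problem starts at hop 2 and carries through the rest of the path, which is the pattern I'd expect from a congested link. In the second, hop 2 is just slow at answering pings and the path behind it is fine.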
BT could also try moving me to a different line card, although that is hit or miss.
Is there any way for BT staff to find out what bandwidth the cabinet has to the exchange? Is it 1G or 10G? If it's 1G, I can understand there being contention, especially with a full cabinet. If it's 10G, or enough not to cause problems, then I must be on one hell of a busy line card, or there is an issue with the DSLAM itself and Openreach need to be made aware. As they are supposed to monitor them, though, I'm going to take a guess and say it's BT that doesn't have enough bandwidth at the DSLAM.
I know that my PCP has two FTTC cabinets, so at least one of them is, or was, full at one point. This also explains why I had issues on TalkTalk a couple of months ago.
In terms of speed measurements, tonight wasn't as bad as Friday but you can see where it starts:
I actually used 10 TCP connections/sockets to reach those speeds, more than most normal speed testers, as I wanted to fully saturate the link.
Here is a ThinkBroadband test:
I emailed someone high up at Openreach a couple of weeks ago. They seemed really on the ball and promised to look into it. I just got a reply from a senior manager on network health. I have removed any confidential info.
"Good morning. "Name" has asked me to respond to you regarding the issue you have raised and detailed below.
Our Network Operations Team have been monitoring the DSLAM backhaul link and have confirmed they are seeing capacity issues at certain times which could be causing the speed degradation you have described.
They have a plan in place to provide further capacity to this DSLAM which involves an additional network fibre and the installation of some exchange hardware. They have provided a provisional completion date for these works as the end of September.
Whilst I understand that the problem may continue until this work has been completed; please be assured that I will personally monitor the progress and will inform you of any changes to the completion date.
Please feel free to contact me directly should you have any comments or concerns."
Looks like it was an issue at the cabinet to the exchange.