Coding Station

Why a different approach is needed to troubleshoot Microsoft Teams



Messaging platforms have benefited hugely from the pandemic by providing online meeting locations to see, share and collaborate with one another. The need to work remotely has meant many of us now depend on them more than ever. For network teams, their users have moved, and it has introduced a huge challenge. Poor network quality can now be seen and heard by everyone.


In the space of just 5 years, Microsoft Teams has become a dominant collaboration tool used across the education, public and private sectors. From around 20M users in 2017, it now boasts over 270M users, and that number is expected to hit 300M by 2023.


Network teams around the globe have had to handle the surge in collaboration application connectivity, and it has added a huge headache, with many still lacking any decent level of visibility. Monitoring levels range from guessing, hoping and having a fair idea of when and where, through to being able to track problems down to specific infrastructure, regions or individual users.


It's easy to forecast when the network is under pressure: normally on the hour, when meetings start and participants all click "join". But knowing where those pressure points are is the real question network teams need to answer. Asking users to plan meetings at 5 past the hour helps to reduce contention, but doesn't remove the actual weight or improve call quality.


The headache for network teams:

1. More moving parts and components they are not responsible for

2. Connectivity is more complex to troubleshoot

3. Network perimeter now extends into homes with poor wifi or broadband contention

4. Problematic hardware, overloaded or poorly performing audio devices, web and client connections

5. Meetings have many participants who connect from anywhere, and it only takes 1 person to reduce quality

6. Meetings introduce unpredictable weight to the network: voice, video, desktop sharing and chat

7. Lack of access to the collaboration tool data/admin logs to correlate with existing tools

8. Timings and audiences are sporadic, along with other network activity

9. Service providers offer reactive services and do not offer root cause

10. Sheer volume of complaints, the network radar has shifted to end users


So, what tools or resources do network teams or service providers typically resort to?


Microsoft provides a network assessment tool. It's a tool that basically tells you that you might have some latency, jitter or loss on the line. It doesn't run continuously, it doesn't run from every user location, and it doesn't simulate a call of any type. It basically just confirms what every network engineer already knows. Before any basic troubleshooting takes place, network engineers must also ensure their network is prepared to handle and prioritise the call types. There is zero point looking at anything else if the basics are not applied.
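To put some numbers on "latency, jitter or loss", here is a minimal Python sketch that summarises one window of probe results and flags anything outside a target. The thresholds (100 ms round trip, 30 ms jitter, 1% loss) are assumptions based on Microsoft's published guidance for Teams media traffic, and the function names are made up for this example, so check the figures against the current documentation before relying on them.

```python
from statistics import mean

# Assumed targets (client to Microsoft edge); verify against current guidance.
MAX_RTT_MS = 100.0
MAX_JITTER_MS = 30.0
MAX_LOSS_PCT = 1.0


def summarise(rtts_ms):
    """Summarise one probe window. `rtts_ms` holds one entry per probe sent:
    a round-trip time in milliseconds, or None if the probe was lost."""
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms) if rtts_ms else 0.0
    avg_rtt = mean(received) if received else None
    # Jitter here is the mean absolute difference between consecutive replies,
    # a simplification of the RFC 3550 interarrival jitter estimate.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return {"loss_pct": loss_pct, "avg_rtt_ms": avg_rtt, "jitter_ms": jitter}


def breaches(summary):
    """Return the names of the metrics that fall outside the assumed targets."""
    failed = []
    if summary["avg_rtt_ms"] is None or summary["avg_rtt_ms"] > MAX_RTT_MS:
        failed.append("rtt")
    if summary["jitter_ms"] > MAX_JITTER_MS:
        failed.append("jitter")
    if summary["loss_pct"] > MAX_LOSS_PCT:
        failed.append("loss")
    return failed
```

A one-off run of something like this only confirms what the assessment tool already tells you. The value comes from running it continuously and from every user location.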


Netflow. Unless a user is sitting inside a managed network location and using MS Teams, there are no netflows. It's no use having only 1 or 2 users and their connectivity monitored, because the chances of them having an issue on an empty corporate network are low. This telemetry is useless if your users are all working from home. If there are only 1 or 2 users in the office, the elimination game would already rule out the corporate network anyway.

Packets. Like netflow above, packets are only useful when connectivity is passing through the devices with taps/spans. It's an expensive way to collect and troubleshoot, and the chances of finding root cause are very low. Once the conversation leaves the span/tap, the connectivity onwards is not traced. The packets use case is more valuable in datacenters, where applications and services are actually hosted.

DEM Tools. Vendors like Aternity, Nexthink and even SASE tool vendors like Zscaler have agents that can sit on every endpoint. A small subset of the agent will focus on the activities within MS Teams: opening Teams, joining a meeting, sharing a desktop. They can even get a call ID from the API when integration is enabled. Without an agent, there is no data. If a user doesn't use Teams, there is no data. If a user joins using the web interface, there is no data. For BYOD, mobile or web devices, there is no data. DEM tools provide only some of the puzzle parts and can help to rule out device factors. They can help to correlate groups of users in a certain location that may have poor experience at the same time across many applications, but they do not give you root cause, unless the users all share the same audio driver that causes call drops (which is rare).

Synthetics. If using web-only versions of Microsoft Teams, then synthetics are a great way to monitor and simulate connectivity. There are many tools like ThousandEyes or Kentik that can simulate connectivity to Microsoft and provide a detailed path view with loss and latency hotspots. The problem is that not many of these tools actually simulate a UDP connection or create a real Teams call. They give you a feel for connectivity in terms of path and can "sometimes" indicate where misconfiguration of QoS may be. However, given the choice of DEM tools, packets or netflows, synthetics are the fastest way to rule out the network.
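To show the gap between a web check and a UDP test, here is a minimal Python sketch of a UDP probe that marks its packets with DSCP 46 (Expedited Forwarding, the class commonly used for voice). It is not how any of the tools above work and it does not reproduce a Teams call. The `udp_probe` function is made up for this example, and it assumes you control a simple UDP echo responder at the far end.

```python
import socket
import time

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice traffic


def udp_probe(host, port, count=5, timeout=1.0, dscp=DSCP_EF):
    """Send `count` UDP datagrams to an echo responder and return a list of
    round-trip times in milliseconds (None for each probe with no reply)."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # DSCP occupies the top six bits of the IP TOS byte.
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        except (OSError, AttributeError):
            pass  # Some platforms ignore or reject application-set markings.
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            start = time.perf_counter()
            try:
                sock.sendto(payload, (host, port))
                # Skip any late replies that belong to earlier probes.
                while True:
                    data, _ = sock.recvfrom(64)
                    if data == payload:
                        break
                rtts.append((time.perf_counter() - start) * 1000.0)
            except OSError:  # covers timeouts and ICMP-reported errors
                rtts.append(None)
    return rtts
```

Calling `udp_probe("192.0.2.10", 7)` (a placeholder address) returns one round-trip time per probe, with `None` for each probe that got no reply. A marking set by the application can still be rewritten or stripped along the path, which is exactly the kind of QoS misconfiguration a hop by hop view is meant to expose.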


API. For some network engineers with big data experience, it's quite easy to query the MS Teams API and analyse the call data. It's rich, with call-by-call data that includes who was in a particular meeting. However, it's not continuous, and it takes time to understand the data and then to correlate the bad calls with bad performance in a time-series format. It's not the best way for network engineers to spend their time either.
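To give a feel for the correlation work involved, here is a minimal Python sketch that groups call records into 15-minute buckets and works out the share of poor calls in each. The record shape (a `start` timestamp and a `poor` flag) is hypothetical and is not the real API schema. Pulling the records and deciding what counts as "poor" is the part that takes the time.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def poor_call_rate(calls, bucket_minutes=15):
    """Group calls into fixed time buckets and return
    {bucket_start: (total_calls, poor_calls, poor_fraction)}."""
    size = timedelta(minutes=bucket_minutes)
    epoch = datetime(1970, 1, 1)
    counts = defaultdict(lambda: [0, 0])
    for call in calls:
        start = call["start"]
        # Floor the start time to the beginning of its bucket.
        bucket = start - ((start - epoch) % size)
        counts[bucket][0] += 1
        if call["poor"]:
            counts[bucket][1] += 1
    return {b: (t, p, p / t) for b, (t, p) in sorted(counts.items())}


# Example with made-up records: two calls just after 09:00, one at 09:20.
calls = [
    {"start": datetime(2022, 5, 2, 9, 0, 30), "poor": True},
    {"start": datetime(2022, 5, 2, 9, 3, 0), "poor": False},
    {"start": datetime(2022, 5, 2, 9, 20, 0), "poor": False},
]
for bucket, (total, poor, rate) in poor_call_rate(calls).items():
    print(bucket, total, poor, rate)
```

Buckets with a high poor-call share can then be lined up against network metrics for the same window, which is where the time-series format earns its keep.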


What are the alternatives?

There are only a few approaches left to fully understand end-to-end connectivity and call performance within Microsoft Teams. Companies like ir.com have leveraged the API approach to provide call center dashboards, but lack the ability to synthetically test or simulate a real Teams call with a hop by hop network view. Analytics365 also have a great dashboard view of Teams, but lack network visibility and support for other collaboration tools such as Webex or Zoom. No one yet has a single solution to fully monitor Microsoft Teams, so a different approach is needed.



Vendor Spotlight:



Hortium seem to have done something very special. After our first demo, we were blown away. The team have a very deep network understanding and took that knowledge to shape a superb offering that is heavily focussed on "any" digital experience. If you have spent time trying to troubleshoot Microsoft Teams, then you understand the complexities. If you read the first section of this post thinking "yeah yeah, heard it all before", then this is for you.


Synthetics - Yes

Synthetics hop by hop - Yes

Synthetic Real Calls - Yes

Call Metrics per call/user - Yes

Real Time & Historical - Yes

Call Center Analytics - Yes

Endpoint Device Support - Yes


The best thing about Hortium is that it isn't just focussed on Microsoft Teams; it supports all other popular collaboration tools and telephony.

Other collaboration products - Yes





The clever people behind the scenes really understand why the network is always blamed for poor call performance. Hortium can actually simulate an audio, video or desktop sharing call. They can also bring in call data from the API to pinpoint which user lowered call quality and the reason why. Agents are supported on both servers and workstations. Hortium also delivers an end-to-end path view across your network that indicates paths falling below the recommended quality baseline.

Audio, Video synthetics with network hop by hop - Yes (that's over UDP with the correct DSCP marking)


Single Pane of Glass. In a similar way to DEM solutions, Hortium also delivers a brilliant single pane of glass for digital experience. Hortium has built its platform on the ability to simulate transactions from any device at any time. Where Lakeside, Aternity & Zscaler fall short, Hortium has filled the huge network telemetry gap with high fidelity reachability data by path. There is no other tool yet that sits between NPM and DEM in the market like this.


Visibility Platforms are a worldwide partner of Hortium.


We are authorised to deliver digital transformation services. Those services include troubleshooting engagements, assessments, migrations and fully managed services that provide 24/7 monitoring and assurance.




If you would like to know more about this blog, or how we can help your organization to improve its Microsoft Teams experience, then drop us an email at info@visibilityplatforms.com
