Transistors, MOSFETs, etc. have a specification for maximum power dissipation. It's normally in the "Absolute Maximum Ratings" section, early on in the data sheet. Exceeding this specification, even briefly, can result in the immediate loss of the Magic MOSFET Smoke (TM).
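This is why, if you're switching a high-current load, you need to drive the gate hard: source a high current into it to turn the MOSFET on, and sink a high current out of it to turn it off, so that the MOSFET switches quickly. During the transition the MOSFET has significant voltage across it and significant current through it at the same time, so that is when its instantaneous dissipation is highest. The faster it gets through that region, the less energy it has to absorb.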
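As a rough illustration of why gate drive current matters, here is a minimal sketch using two common first-order approximations: switching time is roughly the gate charge that has to be moved divided by the gate drive current, and the energy lost per transition is roughly half the product of voltage, current and switching time. The function names and the example figures (20 nC, 24 V, 10 A) are assumptions made up for illustration, not values from any particular data sheet.

```python
def switching_time(gate_charge_c, gate_current_a):
    """Approximate transition time in seconds: charge to move / drive current."""
    return gate_charge_c / gate_current_a


def energy_per_transition(v_ds, i_d, t_sw):
    """First-order estimate of energy (joules) dissipated in one transition.

    Assumes voltage and current overlap for the whole transition, giving a
    roughly triangular power pulse: E = 0.5 * V * I * t.
    """
    return 0.5 * v_ds * i_d * t_sw


# Hypothetical example: 20 nC of gate charge, 24 V supply, 10 A load.
weak_drive = switching_time(20e-9, 0.02)    # 20 mA gate drive
strong_drive = switching_time(20e-9, 2.0)   # 2 A gate drive

print("weak drive:  ", weak_drive, "s,", energy_per_transition(24, 10, weak_drive), "J")
print("strong drive:", strong_drive, "s,", energy_per_transition(24, 10, strong_drive), "J")
```

With these made-up numbers, the 100x stronger gate drive shortens the transition by 100x and cuts the energy absorbed per transition by the same factor. Real devices are more complicated, so treat this as an order-of-magnitude guide only.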
If the MOSFET is designed for high power dissipation, you will usually find graphs in the data sheet that will help you understand the nature of the limits that you must not exceed. There are limits on voltage, current, power, and often a Safe Operating Area (SOA) limit as well. Look at some data sheets for high-power transistors and MOSFETs for guidance.
This is a separate issue from heatsinking. Because of thermal inertia, the heat issue relates to the average power dissipation over a period of time. For example, if your MOSFET dissipates 10 W when it's ON, but it's only ON for 1 ms every second, the average dissipation is only 10 mW, and that is what determines the amount of heatsinking you need.
On the other hand, if it's dissipating 10 W for one minute every 1000 minutes, it's still running at a 0.1% duty cycle, but one minute is long enough for the device to heat up substantially, so you have to heatsink it based on the full 10 W dissipation to prevent it from overheating during that one-minute period.
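The arithmetic for the two cases above can be sketched as follows. The function name is hypothetical. The point of the sketch is that both cases give the same average power, so the average alone does not tell you whether the ON period is short enough for thermal inertia to help.

```python
def average_power(p_on_w, t_on_s, period_s):
    """Average dissipation in watts for a device that dissipates p_on_w
    for t_on_s seconds out of every period_s seconds."""
    if period_s <= 0 or not 0 <= t_on_s <= period_s:
        raise ValueError("need 0 <= t_on_s <= period_s and period_s > 0")
    return p_on_w * t_on_s / period_s


# Case 1: 10 W for 1 ms every second.
short_pulse = average_power(10.0, 1e-3, 1.0)

# Case 2: 10 W for one minute every 1000 minutes.
long_pulse = average_power(10.0, 60.0, 1000 * 60.0)

print("short pulse average:", short_pulse, "W")
print("long pulse average: ", long_pulse, "W")
```

Both come out at about 10 mW. In the first case the 1 ms pulse is short enough for the average to be the useful figure. In the second case the pulse lasts a whole minute, so the heatsink has to be sized for the 10 W peak.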
You need to calculate the heatsink based on the maximum junction temperature, and the thermal resistance at the various interfaces. There is thermal resistance between the junction and the package metal, then between the package metal and the heatsink, and finally, between the heatsink and the ambient air. These thermal resistances are in series, so they all add together, and the junction temperature rise above ambient is the power dissipation multiplied by their total.
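A minimal sketch of that calculation, treating the three thermal resistances as a simple series chain. The function names and the example figures (150 °C maximum junction temperature, 40 °C ambient, 1.5 °C/W junction to case, 0.5 °C/W case to heatsink) are assumptions chosen for illustration; take the real values from your data sheet and your mounting hardware.

```python
def required_heatsink_resistance(t_j_max, t_ambient, power, r_jc, r_cs):
    """Largest heatsink-to-ambient thermal resistance (degC/W) that keeps
    the junction at or below t_j_max.

    r_jc: junction-to-case resistance, r_cs: case-to-heatsink resistance.
    A zero or negative result means no heatsink can meet the target.
    """
    if power <= 0:
        raise ValueError("power must be positive")
    total_allowed = (t_j_max - t_ambient) / power
    return total_allowed - r_jc - r_cs


def junction_temperature(t_ambient, power, r_jc, r_cs, r_sa):
    """Steady-state junction temperature for a given heatsink (r_sa)."""
    return t_ambient + power * (r_jc + r_cs + r_sa)


# Hypothetical example: 10 W continuous, 150 degC max junction, 40 degC ambient.
r_sa_max = required_heatsink_resistance(150.0, 40.0, 10.0, r_jc=1.5, r_cs=0.5)
print("heatsink must be no worse than", r_sa_max, "degC/W")
print("junction temperature with that heatsink:",
      junction_temperature(40.0, 10.0, 1.5, 0.5, r_sa_max), "degC")
```

With these example numbers the heatsink needs to be 9 °C/W or better. That figure puts the junction right at its maximum rating, so in practice you would pick a heatsink with a noticeably lower thermal resistance to leave some margin.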
You should also consider that high junction temperatures will reduce the life expectancy of the component significantly, and that high case temperatures can cause discolouration and loss of insulating quality in the PCB.